Satindra

Hi, I am
Satindra
Kathania.

Data Scientist and ML Engineer


about me

Data Scientist: Transforming Data into Strategic, Impactful Solutions

With a PhD in Life Sciences and a strong passion for problem-solving, I transitioned from experimental data analysis to data science. I specialize in extracting insights from raw data and transforming them into strategic actions. Proficient in Python, R, SQL, Excel, Power BI, and Tableau, I develop predictive models with over 90% accuracy and create compelling visualizations for decision-making. AWS Cloud Certified, with expertise in ETL processes on AWS and Azure, I bring a strong technical foundation. I’m eager to collaborate with forward-thinking organizations to deliver data-driven solutions that drive business impact.

Let’s connect and explore how I can help your organization unlock the full potential of its data.

phone

352-222-7372

email

satindra.kathania@gmail.com

Area of Interest

Here’s a glimpse of what I’m passionate about working on!



Data Analytics

I love telling a story. Making a beautiful and compelling presentation is one of my favorite skills.


In Data Analytics, I enjoy transforming data into actionable insights using various tools and techniques. The process includes:

- Data Collection
- Data Cleaning/Transformation
- Interpretation/Reporting
- Decision-Making & Optimization

Machine Learning

Machine learning is more than an API call to scikit-learn. I love the math and theory as well as the implementation.


In machine learning, I enjoy exploring algorithms, selecting features, building, evaluating, and validating models, and applying them to real-world problems to derive valuable insights.

Model Deployment

I deploy machine learning models to the cloud using REST APIs and build CI/CD pipelines.


My focus in model deployment involves model serving frameworks, data pipeline management, monitoring and logging, and the use of cloud platforms to ensure scalable, reliable, and efficient access to models via RESTful APIs.

Cloud Computing

I maintain servers for database storage, model training, and model deployment.


I leverage cloud technologies to optimize resource and cost management, ensure efficient performance for data storage and processing tasks, work with big-data tools, and implement security and compliance.

Time Series

Working with time series data for prediction and trend analysis.


I analyze time series data to identify patterns, forecast future values, and provide actionable insights for decision-making. I apply time series decomposition, smoothing techniques, and forecasting models such as ARIMA and Prophet, using libraries such as statsmodels.

NLP

Applying Natural Language Processing to derive insights from textual data. I leverage NLP techniques to analyze text, extract information, and build models that understand and generate human language.


I have used frameworks such as NLTK, spaCy, and Transformers, applying linguistic knowledge, text preprocessing (tokenization, stemming, lemmatization, and stop-word removal), feature engineering (TF-IDF), and word embeddings (e.g., Word2Vec, GloVe). I have carried out NLP tasks including sentiment analysis, named entity recognition (NER), machine translation, text classification, and question answering.

Education

2006 - 2011

Ph.D. in Life Sciences

IMTECH-JNU

Chandigarh, INDIA

- Designed and conducted research experiments, analyzed data using statistical methods and specialized techniques.
- Investigated PDE inhibitors’ role in apoptosis, applying advanced data analysis to derive insights.
- Interpreted complex lab results, identifying patterns, troubleshooting issues, and making data-driven decisions for further testing.
- Assisted PI with experiments, statistical reporting, manuscript writing, and presenting findings at lab meetings and conferences.
- Managed lab operations, overseeing data documentation, reporting, and training of students and technicians.
- Developed SOPs, conducted internal audits, and implemented CAPAs.
- Maintained precise documentation of experiments for regulatory reporting.

2003 - 2005

Master of Science

H.P University

Shimla, INDIA

- Molecular Biology Techniques: DNA/RNA extraction, PCR, gel electrophoresis, and cloning.
- Cell Biology & Tissue Culture: Cell culturing, maintenance of cell lines, and primary cell isolation.
- Protein Expression & Purification: Protein isolation, Western blotting, ELISA, and mass spectrometry.
- Bioprocessing & Fermentation: Scaling up cell cultures and fermentation techniques.
- Bioinformatics & Data Analysis: Handling large biological datasets.
- Statistical Analysis: Applying statistical methods to biological data.

2000 - 2003

Bachelor of Science

C.C.S University

Meerut, INDIA

- Microbial Techniques: Culturing and isolation of microorganisms.
- Biochemical Testing: Conducted biochemical assays and antibiotic sensitivity testing.
- Molecular Biology Basics: Gained foundational knowledge in DNA/RNA extraction, PCR.
- Microbial Growth & Control: Studied microbial growth kinetics and control methods.
- Laboratory Documentation & Reporting: Maintained accurate lab notebooks and prepared reports.
- Research Experiment Design: Designed experiments and analyzed results using basic statistical methods.

Technical skills

Python

90%

SQL

85%

R programming

85%

Excel

90%

Tableau

80%

Power BI

85%

Cloud

85%

Information Technology

90%

Soft skills

Critical Thinking

90%

Problem solving

95%

Communication

90%

Collaboration

90%

Adaptability

90%

Attention to detail

85%

Time management

97%

Curiosity

98%

experience

  • Oct 2024 - Present

    Data Science Fellow

    Correlation One - DS4A

    Dallas, USA

    - Branding
    - Portfolio
    - Career coaching

  • Jan 2024 - Sept 2024

    IT Security Analyst - Training

    ACI Learning Tech Academy

    Dallas, USA

    - Strong understanding of ITIL principles for incident management, problem resolution, and service request fulfillment in IT operations.
    - Proficient in installing, configuring, and managing operating systems (Windows, Linux, Unix) and virtualization technologies (VMware, Hyper-V) for efficient IT infrastructure support.
    - Skilled in troubleshooting hardware issues and managing network devices (routers, switches, firewalls) to ensure smooth IT operations.
    - Expertise in user account management, group policies, and directory services (Active Directory, OpenLDAP) to manage access control and security.
    - Familiarity with scripting (PowerShell, Bash, Python) to automate tasks, enhance operational efficiency, and support security automation.
    - Experience in implementing and managing network security solutions, including firewalls, IDS/IPS, vulnerability assessments, and ensuring compliance with security protocols.
    - Proficient in cloud services (AWS, Azure), networking (DNS, TCP/IP), and identity and access management (IAM), with a focus on maintaining IT security and operational resilience.

  • 2019 - 2024

    Data Science Practitioner

    Self Learning/Career Transition

    Dallas, USA

    - Statistical Analysis & Hypothesis Testing: Applied statistical methods such as correlation analysis, hypothesis testing, and probability theory to solve real-world problems.
    - Data Wrangling & Preprocessing: Gained expertise in cleaning, transforming, and preparing structured and unstructured data for analysis using Python libraries like pandas and NumPy.
    - Predictive Modeling: Built predictive models using supervised and unsupervised machine learning techniques, such as linear regression, classification (e.g., decision trees, random forests), and clustering (e.g., K-means).
    - Data Visualization: Created insightful data visualizations using tools like matplotlib, seaborn, and Power BI to communicate findings effectively.
    - SQL for Data Retrieval: Mastered SQL to query and manipulate databases for efficient data retrieval and analysis.
    - Real-World Projects: Worked on industry-specific projects, applying data science techniques to solve business challenges and create actionable insights.
    - Collaboration & Presentation: Collaborated in teams to tackle data-driven projects, honing skills in communication, presentation, and reporting of complex results to non-technical stakeholders.

  • 2020 - 2024

    Cloud Practitioner

    K21 Academy

    Harrow, UK

    - Deployed a static website using Amazon S3 for storage and CloudFront for content delivery. Managed DNS routing using AWS Route 53. Implemented security best practices by setting up IAM policies to control access.
    - Configured AWS Budgets to monitor account spending, set budget alerts, and analyze cost trends using AWS Cost Explorer. Optimized resources to reduce costs based on identified usage patterns.
    - Developed and deployed a serverless application using AWS Lambda and integrated it with API Gateway for API management. Used DynamoDB as the database backend and monitored application performance with CloudWatch.
    - Designed and deployed a high-availability system using EC2, Auto Scaling, and Elastic Load Balancer (ELB). Configured a multi-AZ RDS database for high availability and automatic backups using S3.
    - Built a secure multi-tier architecture in AWS VPC with web servers in public subnets and database servers in private subnets. Used CloudFormation to automate the creation of infrastructure components, including NAT Gateways and Security Groups.
    - Automated deployment using AWS CodePipeline, CodeBuild, and CodeDeploy, integrating it with GitHub. Deployed applications to EC2 instances with automated testing and monitoring.
    - Built an ETL pipeline using AWS Glue to extract data from S3, transform it with Glue jobs, and store the transformed data back in S3. Queried and analyzed data using Amazon Athena for business insights.
    - Set up a real-time streaming pipeline with Amazon Kinesis and processed the data using AWS Lambda. Stored processed data in S3 and analyzed it using Redshift for reporting.
    - Loaded data into Amazon Redshift from S3 using AWS Glue. Designed and optimized a data warehouse schema for performance. Used Amazon QuickSight for visualizing business insights from the data.
    - Automated the provisioning and configuration of EC2 instances using shell scripts for software installation, firewall settings, and cron jobs for backup scheduling.
    - Set up Amazon CloudWatch to monitor server metrics (CPU, memory, disk usage) on EC2 instances. Configured CloudWatch Alarms to send notifications based on threshold limits.
    - Automated backup processes for EC2 instances using snapshots and S3 for data storage. Created scripts to restore EC2 instances from snapshots and backup archives.

  • 2012 - 2014

    Research Associate

    University of Florida

    Florida, USA

    - Led and participated in scientific research projects, performing experiments and managing timelines to achieve research objectives.
    - Conducted statistical analysis, data visualization, and reporting to interpret research findings in receptor tyrosine kinase signaling.
    - Developed new techniques based on experimental data and ensured accurate documentation of procedures, results, and quality control for regulatory compliance.
    - Created research manuscripts, reports, and presentations for publication and conferences, communicating findings effectively.
    - Trained lab personnel in research methods, data analysis, and lab safety, fostering collaboration within the team.
    - Troubleshot technical issues, conducted risk assessments, and implemented strategies to improve research quality and minimize risks.
    - Validated lab methods, equipment, and software platforms, ensuring accuracy and adherence to regulatory standards.

Certifications and Profile

To explore my courses, certifications, and other professional profiles, please use the links below.
They lead to my profiles, where you can view detailed information about the projects I've completed
and my certifications in data analytics, cloud computing, and IT. You can also find the other online
technical courses I have taken, which give a comprehensive overview of my qualifications and expertise.
Your interest in my work is greatly appreciated, and I hope you find the information valuable!

Tableau
Credly
RPubs
GitHub

Other Technical Coursework

Data Analytics Professional Certificate | Google via Coursera
Data Science Specialization | Johns Hopkins University via Coursera
Power BI Financial Reporting & Financial Analysis | Udemy
Machine Learning, Data Science and Generative AI with Python | Udemy
Intermediate to Advanced SQL, Python | Kaggle
AWS SysOps Administrator Associate Certification
AWS Cloud Practitioner Certification (Renewed)
AWS Solutions Architect Associate Certification
AWS Certified Data Analytics Specialty 2023 | Udemy
Ultimate AWS Certified Developer Associate 2021 (DVA-C02) | Udemy
AWS Certified DevOps Engineer Professional 2021 (DOP-C02) | Udemy

portfolio

I invite you to click the link below. My GitHub repository showcases a variety of projects that demonstrate my skills in data analysis,
machine learning, and data visualization. Each project includes detailed documentation that outlines the problem statements,
methodologies, and results, allowing you to understand my thought process and technical abilities.
By reviewing these projects, you will gain insight into my practical experience and commitment to applying data science concepts effectively.
Your interest in my work is valued, and I hope these projects help illustrate my potential as a data scientist!


This project focuses on the analysis of diabetes data to predict the likelihood of diabetes in patients. It involves data preparation, visualization, and the application of machine learning models to create a predictive framework that can assist healthcare professionals in identifying high-risk patients.


The project starts by retrieving video data from the YouTube API and exploring it, followed by data preprocessing and feature engineering. Several data analysis techniques are applied to understand patterns in the dataset, such as the relationship between views, likes, and comments. Machine learning models are then built to predict video performance metrics such as the number of views and engagement levels. Finally, insights from the analysis are presented in the form of visualizations and key takeaways for content strategy improvement.


The project starts with the ingestion of financial data into a Power BI environment, followed by data transformation and the application of DAX functions to generate financial insights. Key financial metrics are calculated, including profit margins, revenue growth, expense ratios, and trend analysis over time. Visualizations in Power BI help users understand trends, anomalies, and areas of improvement.


The project begins by exploring the dataset through data cleaning and preprocessing. Then, exploratory data analysis (EDA) is performed to understand trends and relationships between features such as job satisfaction, salary, department, and attrition. Next, predictive models are built using machine learning techniques to predict employee attrition. Finally, the models are evaluated, and improvements are made to optimize performance.

This project involves building a pipeline for real estate price prediction. It begins with data exploration and feature engineering, followed by the application of machine learning algorithms such as linear regression, decision trees, and random forests. The project also evaluates model performance and explores ways to improve accuracy, ultimately delivering a predictive model for real estate pricing.


The Pizza Sales Data Analysis project is designed to analyze pizza sales data, providing actionable insights to improve sales performance and business operations. The project includes data import, SQL querying, data cleaning, and visualization using Microsoft tools such as MS SQL Server and Power BI. The objective is to create a comprehensive analysis pipeline that helps a pizza business optimize sales strategies, predict customer behavior, and generate insightful reports.


contact me

Satindra Kathania

Data Scientist / ML Engineer

phone

352-222-7372

email

satindra.kathania@gmail.com

website

to be updated