Hi, I'm Daniel! If you're reading this, there's a good chance you're hiring a data analyst, project manager, or one of the many roles in between. If so, scroll down to find out why, in my humble opinion, you should consider hiring me!
Email: dmindlin824@gmail.com | Phone: 818-665-8871
Discover My Work
Below is a chronological look at my experience, covering key data roles, small-scale project coordination responsibilities, and an overall perspective on how analytics can transform practical problems into real solutions.
Senior Data Analyst
Venturing into the healthcare realm, I constructed SQL/Python pipelines that cleansed and modeled large-scale data for Elevance and Aetna, boosting data accuracy by ~20%. In my junior capacity, I shouldered day-to-day planning responsibilities—coordinating minor sprints, aligning tasks with the business team, and ensuring that each data milestone was appropriately validated. My Power BI and Tableau dashboards supported multi-million-dollar strategic initiatives, improving efficiency by ~15%.
I also implemented iterative transformations with dbt (a data build tool) to streamline data extraction, raising processing efficiency another ~15–20%. During this period, I learned to blend standard regression and hypothesis testing (pandas, scikit-learn, SciPy) into everyday workflows, accelerating data-driven decisions at the C-level. Although I was an associate-level contributor, I actively planned pipeline enhancement tasks to ensure on-time delivery and a coherent data journey from ingestion to final insights.
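To give a flavor of that regression-and-testing workflow, here is a minimal, self-contained sketch using synthetic data; every number and variable name is hypothetical, not drawn from the actual healthcare datasets:

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Hypothetical example: claim-processing times before/after a pipeline change
before = rng.normal(10.0, 2.0, 200)  # minutes, old pipeline
after = rng.normal(9.0, 2.0, 200)    # minutes, new pipeline

# Two-sample t-test: is the difference in means statistically significant?
t_stat, p_value = stats.ttest_ind(before, after)
significant = p_value < 0.05  # conventional 5% threshold

# Simple regression: does record volume predict processing time?
volume = rng.uniform(100, 1000, 200).reshape(-1, 1)
time_taken = 5 + 0.004 * volume.ravel() + rng.normal(0, 0.5, 200)
model = LinearRegression().fit(volume, time_taken)
print(f"p-value: {p_value:.4f}, slope: {model.coef_[0]:.4f}")
```

Pairing a quick hypothesis test with a fitted slope like this is often enough to turn a stakeholder hunch into a quantified answer.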
Senior Data Analyst (Contract)
As a contract Senior Data Analyst working alongside a tight-knit team, I focused on forecasting solutions using Python libraries like pandas, scikit-learn, and XGBoost, cutting forecast errors by about 10%. I introduced SAS-based accuracy checks to refine existing forecasting processes, and used dbt in conjunction with Airflow on a modest AWS/Snowflake environment for ~20–25% overall workflow efficiency gains.
My responsibilities included running short weekly stand-ups, setting near-term deadlines, and verifying that deliverables matched stakeholder expectations. This experience taught me how iterative improvements and consistent communication can elevate forecast fidelity—even in a local or mid-scale environment.
Senior Data Analyst (Contract)
At Royal Caribbean Group, I devised SQL/Python revenue models for a fleet of 26 ships, elevating demand projection accuracy by ~15–20%. I also scripted Python-based ticket pricing updates, reducing manual input by ~30–40% and enhancing operational efficiency by ~10–15%. In my associate role, I tracked each deliverable's progress (like implementing advanced window functions) in short sprints, ensuring code quality and data integrity before deployment.
This environment prioritized fast decisions, so I coordinated with operations to confirm pipeline readiness each week. By bridging data engineering tasks and simple project management duties, I helped deliver near real-time fare adjustments and quickly capitalized on new revenue opportunities.
Analytics Consultant
Taking on a consulting role at CVS Health, I worked with SQL (including advanced window functions, CTEs) and Python-based libraries (pandas, scikit-learn) to optimize multi-team data workflows on local or minimal hardware setups. My duties involved clarifying each ingestion or transformation task in a backlog and verifying final outputs within specified deadlines.
By deploying scalable data pipelines (Airflow, dbt), I reduced manual data handling, allowing teams to focus on strategic data usage. I further integrated ML models (TensorFlow, XGBoost) for forecasting. While others determined the vision, I collaborated closely with them, ensuring my small-scale management approach kept daily tasks on target, culminating in a better overall pipeline for timely analytics.
Consulting Analyst
At Qvest.US, I stepped into an associate-level data analyst role supporting SQL-driven data pipelines for cross-department Tableau/Power BI dashboards. These pipelines fostered real-time insights for sales and operations. My self-organized “micro-projects” for each enhancement ensured tasks remained bite-sized and trackable, so stakeholders saw incremental gains every two weeks.
I also ran market and competitive analyses with Python scripts, enabling data-backed strategy formation. Although I juggled typical junior analytics tasks, I found that minimal but structured project planning (like short stand-ups and Gantt charts) significantly boosted visibility and maintained progress across concurrent tasks.
University of California, Santa Barbara
Graduated: August 2020
University of San Diego
Graduated: May 2021
SQL, Python (pandas, NumPy, scikit-learn, TensorFlow)
Tableau, Power BI, Plotly, Matplotlib, Seaborn
Spark, Snowflake, dbt, Airflow, AWS, Azure, BigQuery
ETL/ELT, API Integration, Web Scraping
Classification Models, Forecasting, NLP
Salesforce
Amazon Web Services
Project Overview: In this personal exercise, I aimed to uncover emotional trends in a massive collection of tweets. By meticulously cleaning text data and employing logistic regression, I explored how sentiment fluctuates over time in response to key events or viral topics.
Process: After normalizing tweets (removing special characters, tokenizing words, applying TF-IDF), I trained the model with cross-validation to optimize accuracy. This approach highlighted how strongly negative tweets often spiked around controversial subjects. Additionally, an overlay of retweet volume revealed that viral negativity often garners outsized engagement.
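The cleaning-and-classification approach can be sketched in a few lines of scikit-learn. This toy version uses a tiny hypothetical corpus in place of the real tweet dataset, so the scores are illustrative only:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical mini-corpus standing in for the cleaned tweets
tweets = [
    "love this product so much", "what a great day",
    "absolutely fantastic news", "really happy with the results",
    "this is terrible and disappointing", "worst experience ever",
    "so angry about this decision", "awful service, never again",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = positive, 0 = negative

# TF-IDF features feeding a logistic regression, scored via cross-validation
pipeline = make_pipeline(
    TfidfVectorizer(lowercase=True),
    LogisticRegression(max_iter=1000),
)
scores = cross_val_score(pipeline, tweets, labels, cv=4)
print(f"mean CV accuracy: {scores.mean():.2f}")
```

On the real dataset the vectorizer and regularization strength were tuned within the same cross-validation loop, so the reported accuracy reflects held-out tweets rather than training data.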
Results & Insights: The final model exceeded 90% accuracy on validation data, suggesting it reliably gauges overall sentiment. These insights pointed to a correlation between polarizing content and higher user engagement, underscoring how emotional language shapes social media dynamics.
Visualization: Average Sentiment & Retweet Volume
The left axis displays sentiment (-1 to +1), while the right axis captures total retweet volume. Spikes in negativity consistently align with heavy retweet activity, illustrating how emotive content proliferates faster online.
Project Overview: Using detailed sales data covering multiple product lines over two years, I developed a forecasting system to anticipate monthly demand shifts. The goal was to reduce last-minute rush shipping and better align promotional timing.
Process: I cleaned the dataset (removing anomalies), introduced features like promotional flags and day-of-week indicators, and tested multiple modeling strategies. An ensemble of Prophet (for seasonality) and RandomForestRegressor (for non-linear interactions) outperformed baseline ARIMA, validated via rolling window back-testing.
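The rolling-window back-test is the piece most worth illustrating. The sketch below uses synthetic monthly sales and a lone RandomForestRegressor (Prophet and the full ensemble are omitted for brevity); every number here is made up:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Synthetic 24 months of sales with seasonality and a promo effect (hypothetical)
months = pd.date_range("2022-01-01", periods=24, freq="MS")
promo = rng.integers(0, 2, 24)
sales = 100 + 20 * np.sin(2 * np.pi * months.month / 12) + 15 * promo + rng.normal(0, 5, 24)
df = pd.DataFrame({"month_num": months.month, "promo": promo, "sales": sales})

# Rolling-window back-test: train on everything before month t, predict month t
errors, naive_errors = [], []
for t in range(12, 24):
    train, test = df.iloc[:t], df.iloc[t : t + 1]
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(train[["month_num", "promo"]], train["sales"])
    pred = model.predict(test[["month_num", "promo"]])[0]
    errors.append(abs(pred - test["sales"].iloc[0]))
    # Naive baseline: repeat last observed month's sales
    naive_errors.append(abs(train["sales"].iloc[-1] - test["sales"].iloc[0]))

mae, naive_mae = np.mean(errors), np.mean(naive_errors)
print(f"model MAE: {mae:.1f}, naive MAE: {naive_mae:.1f}")
```

Walking the training window forward one month at a time mimics how the model would actually be used in production, which is why the ~17% error reduction cited above was measured this way rather than on a single random split.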
Results & Insights: The final model cut mean absolute error by ~17% over naive methods, enabling managers to reorder more precisely. Detailed error analysis revealed certain categories exhibited irregular surges around holidays or marketing pushes, highlighting the need for real-time forecast updates during those periods.
Visualization: Actual & Forecasted Monthly Sales (Confidence Bands)
Bars show actual sales across three product categories, while lines depict predicted volumes. The shaded areas around Category A’s forecast illustrate a 95% confidence interval, highlighting potential variance in demand.
Project Overview: Merging session logs with transaction histories, I aimed to cluster users based on their buying patterns, frequency, and average order values. The resulting segments guided marketing teams toward more effective loyalty and upsell strategies.
Process: I engineered features like recency, cart abandonment rates, and total spend, then compared algorithms (K-Means, DBSCAN, hierarchical) using silhouette scores. K-Means with five clusters offered the best interpretability. Each segment was labeled by characteristic behaviors—for instance, “Mid-Freq, Mid-Spend” or “High-Freq, High-Spend,” enabling targeted approaches.
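A condensed version of the cluster-count comparison might look like this, using synthetic frequency and spend features in place of the real session and transaction data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)

# Hypothetical features: purchase frequency (per month) and average spend ($)
low = np.column_stack([rng.normal(1, 0.3, 100), rng.normal(20, 5, 100)])
mid = np.column_stack([rng.normal(4, 0.5, 100), rng.normal(60, 8, 100)])
high = np.column_stack([rng.normal(9, 1.0, 100), rng.normal(150, 20, 100)])
X = StandardScaler().fit_transform(np.vstack([low, mid, high]))

# Compare cluster counts by silhouette score (higher = better separation)
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(f"best k by silhouette: {best_k}")
```

Scaling the features first matters: without it, the dollar-denominated spend axis would dominate the distance metric and frequency would barely influence the clusters.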
Results & Insights: Marketing campaigns tailored to each cluster boosted email open rates by ~18%. Observing user migration across segments also helped detect churn risk or ascendant buying patterns. The net outcome: a more nuanced view of consumer habits, fueling data-driven retargeting and retention efforts.
Visualization: Five Clusters by Purchase Frequency & Average Spend
A color-coded scatter plot reveals the distinct groups, with each cluster demonstrating unique spending and frequency metrics. Labeled centroids help pinpoint typical buyer behaviors, informing more nuanced marketing.
Project Overview: I developed a pipeline to identify and visualize anomalies in IoT sensor data in near real-time. By proactively spotting erratic readings, the system provided early warnings of potential hardware failures or hazardous conditions in connected devices (e.g., smart thermostats or industrial machinery).
Process: I configured a local environment using Python for ingestion and quick ETL tasks, alongside Kafka for data queuing. Each sensor feed transmitted temperature, vibration, and humidity metrics. An isolation forest model (an algorithm that flags outliers in multi-dimensional data) identified unusual spikes or dips. A threshold-based approach triggered Slack alerts when sensor values breached normal operating ranges.
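The isolation-forest step can be sketched as follows, with synthetic temperature and vibration readings standing in for the live Kafka feeds (the queuing and Slack-alert layers are omitted):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(3)

# Hypothetical sensor readings: temperature (deg C) and vibration (mm/s)
normal = np.column_stack([rng.normal(70, 2, 500), rng.normal(1.0, 0.1, 500)])
spikes = np.array([[95.0, 3.5], [40.0, 0.2], [72.0, 6.0]])  # injected anomalies
readings = np.vstack([normal, spikes])

# contamination ~= expected anomaly fraction; tuned per deployment
model = IsolationForest(contamination=0.01, random_state=0).fit(readings)
flags = model.predict(readings)  # -1 = anomaly, 1 = normal

anomaly_idx = np.where(flags == -1)[0]
print(f"flagged {len(anomaly_idx)} of {len(readings)} readings")
```

Because the model scores temperature and vibration jointly, it can flag combinations that look unremarkable on either axis alone, which is exactly the correlated-spike pattern noted in the results below.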
Results & Insights: The pipeline flagged incremental deviations that manual checks would likely miss, cutting mean response time to abnormal events by 40%. Cluster-based analysis revealed that anomalies often involved simultaneous temperature and vibration jumps, reinforcing the importance of correlating sensor variables. Overall, this system helped detect mechanical issues earlier, reducing downtime and repair costs.
Visualization: Temperature & Vibration with Threshold Lines
The chart above shows both temperature (left axis) and vibration (right axis), with dashed lines marking upper and lower thresholds. Any data points that cross these thresholds or exhibit extreme combined behaviors are flagged as anomalies, highlighting critical events in real time.
Project Overview: In this self-directed project, I explored classification techniques to identify potentially fraudulent transactions. Since fraud typically accounts for a tiny percentage of all credit card activity, my primary challenge was dealing with this highly imbalanced dataset to ensure legitimate transactions weren’t frequently misclassified while still catching true fraud.
Process: First, I split the transaction data into training and test sets. I then introduced specialized methods for imbalanced learning, such as SMOTE (Synthetic Minority Over-Sampling Technique) to replicate fraud examples more evenly, and undersampling of legitimate transactions to maintain a workable ratio. I ran multiple experiments using RandomForestClassifier and XGBoost with a heavy focus on the ROC–AUC (Receiver Operating Characteristic–Area Under the Curve) as my primary evaluation metric, because it illustrates how well the model distinguishes fraud from legitimate transactions at various thresholds.
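A minimal stand-in for the resampling experiment, using scikit-learn only: random oversampling duplicates minority examples, whereas SMOTE (from the separate imbalanced-learn package) synthesizes new ones, but the balancing idea is the same. The data here is synthetic:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Synthetic imbalanced data: ~1% "fraud" (hypothetical stand-in for transactions)
X, y = make_classification(n_samples=5000, n_features=10,
                           weights=[0.99, 0.01], flip_y=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Random oversampling: duplicate minority rows until classes are balanced
n_legit = (y_train == 0).sum()
fraud_up = resample(X_train[y_train == 1], n_samples=n_legit, random_state=0)
X_bal = np.vstack([X_train[y_train == 0], fraud_up])
y_bal = np.concatenate([np.zeros(n_legit), np.ones(n_legit)])

# Evaluate with ROC-AUC, which is threshold-independent
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_bal, y_bal)
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"ROC-AUC: {auc:.3f}")
```

Note that resampling is applied only to the training split; evaluating on a resampled test set would inflate the metric.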
Results & Insights: My final ensemble model achieved ~98% ROC–AUC, reducing missed fraud (false negatives) significantly. I also analyzed false positives (legitimate transactions flagged as fraud), finding patterns like legitimate overseas travel or sporadic big-ticket purchases. By applying domain knowledge to these borderline cases, I refined the pipeline to avoid inconveniencing users who legitimately exhibit “abnormal” patterns. Overall, this approach showcased how careful handling of minority classes can deliver high-confidence fraud alerts without overwhelming call centers or negatively impacting customer experiences.
Visualization: Confusion Matrix & ROC Curve
Here you can see a confusion matrix (a table comparing predicted vs. actual classes), which helps measure how many transactions were correctly or incorrectly categorized. A false positive occurs when a legitimate transaction is wrongly flagged as fraud, while a false negative happens when a fraudulent transaction is incorrectly labeled as legitimate. On the right is the ROC curve (Receiver Operating Characteristic curve), plotting the true positive rate (the fraction of all fraud that is correctly caught) against the false positive rate (the fraction of all legitimate transactions that are incorrectly flagged). The further the ROC curve pushes toward the top-left, the better the model distinguishes fraud from legitimate transactions across different thresholds. The curve’s Area Under the Curve (AUC) is a consolidated measure of the model’s overall separating power.
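For readers who prefer numbers to prose, the confusion-matrix quantities and the two ROC rates can be computed directly (the ten labels below are hypothetical):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical predictions for 10 transactions (1 = fraud, 0 = legitimate)
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 0])

# For binary labels, ravel() yields (tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
tpr = tp / (tp + fn)  # true positive rate: fraction of fraud caught
fpr = fp / (fp + tn)  # false positive rate: legit transactions wrongly flagged
print(f"TP={tp} FP={fp} FN={fn} TN={tn}, TPR={tpr:.2f}, FPR={fpr:.2f}")
```

Sweeping the classification threshold and recomputing (TPR, FPR) at each setting traces out the ROC curve described above.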
I appreciate your interest in my portfolio. If you have a data challenge that could benefit from local, Python-based development and thorough analytics, please reach out to learn more or to discuss possible collaborations.
Email: dmindlin824@gmail.com
Phone: 818-665-8871
LinkedIn: linkedin.com/in/daniel-mindlin