Data Scientist Intern

Flexon Technologies Talent360.ai • Internship • Pleasanton, CA, US • 2d ago

Data Scientist Intern — Predictive ML, Time Series & Deep Learning

We’re looking for a curious, hands-on Data Scientist intern to help build and evaluate predictive machine-learning solutions for time-series problems. You’ll work closely with product and engineering teams to design experiments, preprocess real-world time-series data, and prototype models, from classical statistical methods to deep-learning architectures (RNNs, LSTMs). This is a learning-first role with real impact: your models will help drive business decisions and product features.

Key responsibilities

Collect, clean, and explore time-series and panel datasets; handle missing data, irregular sampling, and seasonality.
Design and implement predictive models (ARIMA, SARIMAX, Prophet, XGBoost/LightGBM, etc.) and deep-learning models (RNN, LSTM, GRU).
Build end-to-end model pipelines for training, validation, and evaluation, using time-series cross-validation with rolling windows.
Engineer time-series features: lags, rolling-window aggregates, trend/seasonality decomposition, and calendar/epoch features.
Evaluate models with metrics suited to the task: MAE, RMSE, and MAPE for forecasting; precision/recall and ROC AUC for classification, where applicable.
Collaborate with engineers to productionize prototypes or create reproducible experiments (notebooks, scripts, model checkpoints).
Document experiments, assumptions, and results; present concise findings to stakeholders.
Stay current with literature and experiment with state-of-the-art architectures and strategies (attention mechanisms, sequence-to-sequence, ensembling, etc.).

Required qualifications

Currently pursuing (or recently completed) a BS/MS in Data Science, Computer Science, Statistics, Applied Math, or a related field.
Strong fundamentals in statistics, probability, and predictive modeling.
Practical experience with Python and key libraries: NumPy, pandas, scikit-learn.
Familiarity with deep-learning frameworks (TensorFlow/Keras or PyTorch) and experience implementing RNN/LSTM models.
Hands-on experience with time-series modeling concepts (stationarity, differencing, seasonality, autocorrelation).
Experience with data visualization and exploratory analysis (Matplotlib, Seaborn, Plotly).
Good communication skills and the ability to explain technical work to non-technical stakeholders.