Help me build an interactive dashboard to analyze user retention data:

Data sources:
- User registration data (users.csv)
- User activity log (activity.csv)

Functional requirements:
- Display 7-day, 30-day, and 90-day retention rate curves
- Group comparison by user source (advertising, organic traffic, referral)
- Time range filter
- Bar charts showing weekly active user trends

Technology stack: Use React + Recharts to generate a web page that can be run directly.
Help me analyze this sales data (sales_2024.csv):

1. Give me an overview of the data (number of rows, columns, missing values)
2. Generate descriptive statistics (mean, median, standard deviation)
3. Identify outliers
4. Run a correlation analysis to see which factors affect sales
5. Plot distributions and trends of key indicators
6. Summarize 3-5 key findings
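The steps above map to a few lines of pandas. A minimal sketch, with a small inline frame standing in for sales_2024.csv (column names are illustrative):

```python
# Quick EDA sketch: overview, descriptive stats, and IQR-based outliers.
import pandas as pd

df = pd.DataFrame({
    "region": ["north", "south", "north", "west", "south"],
    "units":  [10, 12, 11, 95, 13],          # 95 is a deliberate outlier
    "sales":  [100.0, 130.0, 110.0, 900.0, None],
})

# 1. Overview: shape and missing values
print(df.shape)                    # (rows, columns)
print(df.isna().sum())             # missing values per column

# 2. Descriptive statistics for a numeric column
print(df["units"].describe())

# 3. IQR rule: flag values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
q1, q3 = df["units"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["units"] < q1 - 1.5 * iqr) | (df["units"] > q3 + 1.5 * iqr)]
print(outliers)

# 4. Correlation between numeric factors and sales
print(df[["units", "sales"]].corr())
```

The IQR rule is one common outlier heuristic; z-scores or domain-specific thresholds work just as well depending on the data.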
Use this customer churn data (churn_data.csv) to train a prediction model:

1. Data preprocessing (handle missing values, standardize numerical features)
2. Feature engineering (generate useful new features)
3. Train several models (logistic regression, random forest, XGBoost)
4. Compare model performance (precision, recall, F1, AUC)
5. Generate a feature importance analysis
6. Give me prediction code for the best model so I can score new customers
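The train-and-compare loop might look like the sketch below, assuming scikit-learn is available. Synthetic data stands in for churn_data.csv, and only two of the three requested model families are shown:

```python
# Minimal model-comparison sketch: standardize, train two models, compare AUC.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                # 4 hypothetical numeric features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Preprocessing: standardize numeric features (fit on train only)
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Train and compare two of the requested models by AUC
models = {
    "logreg": LogisticRegression(),
    "rf": RandomForestClassifier(n_estimators=100, random_state=0),
}
aucs = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    aucs[name] = roc_auc_score(y_test, model.predict_proba(X_test_s)[:, 1])
print(aucs)

# Feature importances from the random forest
print(models["rf"].feature_importances_)
```

XGBoost slots into the same dict with an identical fit/predict_proba interface.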
Help me set up automated monitoring:

- Check this BigQuery table every morning at 9am
- Send an alert if daily active users are 20% below the 7-day average
- Send an alert if the error rate exceeds 5%
- Generate a daily data summary and send it to my email
Scheduled automations are available for Pro/Max plan users.
Perform customer segmentation analysis on this data:

[Upload customer_data.csv]

Data includes:
- Demographics (age, location, income)
- Purchase history (frequency, value, recency)
- Engagement metrics (email opens, website visits)

Tasks:
1. Perform RFM analysis (Recency, Frequency, Monetary)
2. Use K-means clustering to identify 4-5 customer segments
3. Profile each segment (characteristics, behaviors)
4. Visualize segments with scatter plots and radar charts
5. Recommend marketing strategies for each segment
6. Export segment assignments for use in CRM
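Tasks 1-3 can be sketched in a few lines, assuming scikit-learn; the column names are placeholders for whatever customer_data.csv actually contains:

```python
# RFM quintile scoring plus K-means segmentation on synthetic customers.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "recency_days": rng.integers(1, 365, size=200),   # days since last purchase
    "frequency":    rng.integers(1, 50, size=200),    # purchases per year
    "monetary":     rng.uniform(10, 5000, size=200),  # total spend
})

# 1. RFM scores: quintiles, with recency reversed (recent = high score).
#    rank(method="first") avoids qcut failing on duplicate bin edges.
df["R"] = pd.qcut(df["recency_days"].rank(method="first"), 5, labels=[5, 4, 3, 2, 1]).astype(int)
df["F"] = pd.qcut(df["frequency"].rank(method="first"), 5, labels=[1, 2, 3, 4, 5]).astype(int)
df["M"] = pd.qcut(df["monetary"], 5, labels=[1, 2, 3, 4, 5]).astype(int)

# 2. K-means on standardized raw RFM values
X = StandardScaler().fit_transform(df[["recency_days", "frequency", "monetary"]])
df["segment"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# 3. Profile each segment with its mean RFM values
print(df.groupby("segment")[["recency_days", "frequency", "monetary"]].mean())
```

Exporting the assignments for CRM is then just `df[["segment"]].to_csv(...)`.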
Create a sales forecasting model:

Data: monthly_sales.csv (3 years of historical data)

Requirements:
1. Decompose the time series (trend, seasonality, residuals)
2. Check for stationarity (ADF test)
3. Train multiple models:
   - ARIMA
   - Prophet (Facebook's forecasting tool)
   - LSTM (if patterns are complex)
4. Compare model performance (RMSE, MAE, MAPE)
5. Forecast the next 6 months
6. Create confidence intervals
7. Visualize historical data + forecasts with an interactive chart

Explain which model performed best and why.
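To illustrate step 1, here is a hand-rolled decomposition in plain pandas on synthetic monthly data; in practice statsmodels' `seasonal_decompose` and `adfuller` would handle steps 1-2:

```python
# Classical decomposition sketch: trend via rolling mean, seasonality via
# per-month averages of the detrended series, residual as the remainder.
import numpy as np
import pandas as pd

# Synthetic 36 months: linear trend + yearly seasonality + noise
idx = pd.date_range("2021-01-01", periods=36, freq="MS")
rng = np.random.default_rng(2)
trend = np.linspace(100, 200, 36)
seasonal = 10 * np.sin(2 * np.pi * idx.month / 12)
sales = pd.Series(trend + seasonal + rng.normal(0, 2, 36), index=idx)

# Trend: centered 12-month rolling mean
est_trend = sales.rolling(12, center=True).mean()

# Seasonality: average detrended value per calendar month
detrended = sales - est_trend
est_seasonal = detrended.groupby(detrended.index.month).mean()

# Residual: what's left after removing trend and seasonality
residual = detrended - est_seasonal.reindex(detrended.index.month).to_numpy()
print(est_seasonal.round(1))
```

A roughly flat, small residual suggests the additive trend + seasonality model captures most of the structure.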
Analyze results from our A/B test:

[Upload ab_test_results.csv]

Test details:
- Control: Current checkout flow
- Variant: New one-click checkout
- Metrics: Conversion rate, average order value, completion time
- Sample size: 10,000 users per variant

Analysis needed:
1. Calculate statistical significance (p-value, confidence intervals)
2. Check for sample ratio mismatch
3. Analyze by user segments (new vs. returning, device type)
4. Calculate practical significance (effect size)
5. Estimate revenue impact if we roll out the variant
6. Visualize results with clear charts
7. Provide a go/no-go recommendation

Be conservative with statistical interpretation.
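The significance test in step 1 is a standard two-proportion z-test. A sketch using only the standard library, with illustrative counts (not from ab_test_results.csv):

```python
# Two-sided z-test for a difference in conversion rates between variants.
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# 10,000 users per variant, as in the test details above; counts are made up
z, p = two_proportion_ztest(conv_a=800, n_a=10_000, conv_b=900, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these example counts the lift is significant at the 5% level, but "conservative interpretation" also means checking effect size and segment consistency before a go decision.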
Build a cohort retention analysis:

Data: user_activity.csv with user_id, signup_date, activity_date

Analysis:
1. Group users by signup month (cohorts)
2. Calculate retention for each cohort at:
   - Day 1, 7, 14, 30, 60, 90
3. Create a cohort retention heatmap
4. Identify which cohorts have the best retention
5. Analyze whether there are seasonal patterns
6. Compare cohorts before/after a major feature launch (June 2024)
7. Build an interactive dashboard to explore different cohort groupings

Help me understand what's driving retention differences.
Help me optimize this slow SQL query:

[Paste SQL query]

Database: PostgreSQL
Table size: 50M rows
Current execution time: 45 seconds

Please:
1. Explain what the query is doing
2. Identify performance bottlenecks
3. Suggest optimizations (indexes, query rewrite, etc.)
4. Explain the execution plan
5. Provide the optimized version
6. Estimate the expected performance improvement

Also suggest which indexes I should create.
Create an automated weekly analytics report:

Data sources:
- PostgreSQL database (user events)
- Google Analytics (via API)
- Stripe (via API, for revenue data)

Report sections:
1. Executive summary
   - Key metrics vs. previous week
   - Notable changes (flag >10% changes)
2. User growth
   - New signups (daily trend)
   - Activation rate
   - Growth rate by channel
3. Engagement
   - DAU/WAU/MAU trends
   - Feature usage breakdown
   - Session duration analysis
4. Revenue
   - MRR and growth rate
   - New vs. expansion vs. churn
   - Customer LTV by cohort

Output:
- HTML report with embedded charts
- PDF version for distribution
- Send via email every Monday at 8am

Automate this to run weekly. (Pro/Max plan)
Set up data quality checks:

Dataset: user_events table (BigQuery)

Checks to implement:
1. Completeness
   - No null values in critical fields
   - Expected row count (±20% of 7-day average)
2. Uniqueness
   - No duplicate event_ids
   - user_id format validation
3. Timeliness
   - Events processed within 1 hour
   - No data gaps > 30 minutes
4. Validity
   - Timestamps in a reasonable range
   - Numeric fields within expected bounds
   - Category values match the allowed list

Alert me if any check fails. Run checks every hour.
Provide a dashboard showing data quality trends.
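The completeness, uniqueness, and validity checks can each be expressed as one-line predicates. A sketch on an inline frame; a real setup would pull from BigQuery and run on a schedule:

```python
# Data quality check sketch: return a list of failed-check descriptions.
import pandas as pd

events = pd.DataFrame({
    "event_id": ["e1", "e2", "e2", "e4"],           # e2 duplicated on purpose
    "user_id":  ["u1", "u2", None, "u4"],           # one null on purpose
    "value":    [3.0, 5.0, 7.0, 999.0],             # 999 is out of bounds
})

def run_checks(df, critical=("event_id", "user_id"), bounds=(0, 100)):
    failures = []
    # Completeness: no nulls in critical fields
    for col in critical:
        if df[col].isna().any():
            failures.append(f"nulls in {col}")
    # Uniqueness: no duplicate event_ids
    if df["event_id"].duplicated().any():
        failures.append("duplicate event_ids")
    # Validity: numeric fields within expected bounds
    lo, hi = bounds
    if ((df["value"] < lo) | (df["value"] > hi)).any():
        failures.append("value out of bounds")
    return failures

print(run_checks(events))
```

An empty return list means all checks passed; anything else feeds the alerting step.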
Help me build a feature store for ML models:

Raw data: user_profiles.csv, transactions.csv, events.csv

Features to engineer:
1. User features:
   - Total transactions (all time)
   - Average transaction value
   - Days since last transaction
   - Transaction frequency (per month)
   - Preferred categories
2. Behavioral features:
   - Page views in the last 7/30 days
   - Session count in the last 7/30 days
   - Engagement score (custom calculation)
3. Temporal features:
   - Day-of-week patterns
   - Time-of-day patterns

Requirements:
- Update features daily
- Store in a format ready for model training
- Handle missing values appropriately
- Include feature documentation
- Version control for features

Create a pipeline that computes and updates these features.
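The user features in group 1 are one `groupby().agg()` away. A sketch with synthetic rows standing in for transactions.csv (column names are assumptions):

```python
# User-level transaction features: count, average value, days since last.
import pandas as pd

tx = pd.DataFrame({
    "user_id": ["u1", "u1", "u2"],
    "amount":  [10.0, 30.0, 50.0],
    "ts":      pd.to_datetime(["2024-05-01", "2024-06-01", "2024-04-15"]),
})
as_of = pd.Timestamp("2024-06-11")   # feature computation date

features = tx.groupby("user_id").agg(
    total_transactions=("amount", "size"),
    avg_transaction_value=("amount", "mean"),
    last_transaction=("ts", "max"),
)
features["days_since_last"] = (as_of - features["last_transaction"]).dt.days
features = features.drop(columns="last_transaction")
print(features)
```

Running the same aggregation daily with a moving `as_of`, then writing to Parquet, covers the "update daily, training-ready" requirements.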
Create an interactive dashboard where users can:

- Filter by date range with a slider
- Toggle between different metrics
- Hover to see detailed values
- Click on segments to drill down
- Export the current view as a PNG

Use Plotly or Recharts for interactivity.
I have a 10GB CSV file that's too large to process in memory. Help me:

1. Process it in chunks
2. Perform aggregations efficiently
3. Create summary statistics
4. Sample the data for visualization
5. Identify outliers without loading everything

Use appropriate tools (Dask, Polars, or chunking strategies).
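The chunking strategy amounts to streaming the file and keeping only running totals in memory. A sketch using pandas' `chunksize`, with a small generated CSV standing in for the 10GB one:

```python
# Chunked aggregation: per-group means without loading the whole file.
import csv
import os
import tempfile
from collections import defaultdict

import pandas as pd

# Build a stand-in CSV (the real file would already exist on disk)
fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
with open(path, "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["category", "value"])
    for i in range(1000):
        w.writerow(["a" if i % 2 else "b", i])

# Stream it in chunks, keeping only running sums and counts in memory
totals, counts = defaultdict(float), defaultdict(int)
for chunk in pd.read_csv(path, chunksize=100):
    for cat, grp in chunk.groupby("category"):
        totals[cat] += grp["value"].sum()
        counts[cat] += len(grp)

means = {cat: totals[cat] / counts[cat] for cat in totals}
print(means)   # per-group mean computed without a full load
```

The same pattern extends to min/max and (via Welford's algorithm) variance; Polars' lazy scan or Dask would replace the manual loop at larger scale.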
This dashboard query is taking 30+ seconds to load.

Current query: [paste SQL]

Please optimize by:
1. Identifying unnecessary joins
2. Suggesting appropriate indexes
3. Rewriting with better structure
4. Using materialized views if appropriate
5. Implementing a caching strategy

Goal: <5 second load time