A batch analytics platform with a 3-layer data engineering pipeline (Raw → Staging → Analytics) that analyzes trending GitHub repositories across three programming languages: Python, TypeScript (Next.js), and Go. It leverages Render Workflows' distributed task execution to process data in parallel, storing results in a dimensional model for high-performance analytics.
- Multi-Language Analysis: Tracks Python, TypeScript/Next.js, and Go repositories
- 3-Layer Data Pipeline: Raw ingestion → Staging validation → Analytics dimensional model
- Parallel Processing: 4 concurrent workflow tasks using Render Workflows SDK
- Render Ecosystem Spotlight: Dedicated showcase for Render-deployed projects
- Real-time Dashboard: Next.js 14 dashboard with analytics visualizations
- Hourly Updates: Automated cron job triggers workflow execution
```mermaid
graph TD
    A[Cron Job Hourly] --> B[Workflow Orchestrator]
    B --> C[Python Analyzer]
    B --> D[TypeScript Analyzer]
    B --> E[Go Analyzer]
    B --> F[Render Ecosystem]
    C --> G[Raw Layer JSONB]
    D --> G
    E --> G
    F --> G
    G --> H[Staging Layer Validated]
    H --> I[Analytics Layer Fact/Dim]
    I --> J[Next.js Dashboard]
```
Backend (Workflows)
- Python 3.11+
- Render Workflows SDK with `@task` decorators
- asyncpg for PostgreSQL
- aiohttp for async API calls
- GitHub REST API
Frontend (Dashboard)
- Next.js 14 (App Router)
- TypeScript
- Tailwind CSS
- Recharts for visualizations
- `pg` for PostgreSQL access
Infrastructure
- Render Workflows (task execution)
- Render Cron Job (hourly trigger)
- Render Web Service (Next.js dashboard)
- Render PostgreSQL (data storage)
```
trender/
├── workflows/
│   ├── workflow.py          # Main workflow with @task decorators
│   ├── github_api.py        # Async GitHub API client
│   ├── connections.py       # Shared resource management
│   ├── metrics.py           # Momentum/activity calculations
│   ├── render_detection.py  # Render usage detection
│   ├── etl/
│   │   ├── extract.py       # Raw layer extraction
│   │   ├── transform.py     # Staging transformations
│   │   ├── load.py          # Analytics layer loading
│   │   └── data_quality.py  # Quality scoring
│   └── requirements.txt
├── trigger/
│   ├── trigger.py           # Cron trigger script
│   └── requirements.txt
├── dashboard/
│   ├── app/                 # Next.js App Router pages
│   ├── lib/
│   │   └── db.ts            # Database utilities
│   └── package.json
├── database/
│   ├── schema/
│   │   ├── 01_raw_layer.sql
│   │   ├── 02_staging_layer.sql
│   │   ├── 03_analytics_layer.sql
│   │   └── 04_views.sql
│   └── init.sql
├── render.yaml
├── .env.example
└── README.md
```
- GitHub App with client ID and client secret
- Render account
- Node.js 18+ (for dashboard)
- Python 3.11+ (for workflows)
```bash
git clone <your-repo-url>
cd trender
cp .env.example .env
# Edit .env with your credentials
```

Required variables:

- `GITHUB_CLIENT_ID`: Your GitHub App client ID
- `GITHUB_CLIENT_SECRET`: Your GitHub App client secret
- `DATABASE_URL`: PostgreSQL connection string
- `RENDER_WORKFLOW_ID`: Workflow ID (after deployment)
- `RENDER_API_KEY`: Render API key
- Go to the Render Dashboard
- Create a new PostgreSQL database named `trender`
- Note the connection string for `DATABASE_URL`
```bash
# Connect to your Render PostgreSQL instance
psql $DATABASE_URL -f database/init.sql
```

This will create:

- Raw layer tables (`raw_github_repos`, `raw_repo_metrics`)
- Staging layer tables (`stg_repos_validated`, `stg_render_enrichment`)
- Analytics layer tables (dimensions and facts)
- Analytics views for dashboard queries
```bash
# Install Render Workflows SDK
pip install render-sdk

# Deploy workflow
cd workflows
render-workflows deploy workflow.py
# Note the WORKFLOW_ID from the output
```

Set environment variables in the Render Workflows dashboard:

- `GITHUB_CLIENT_ID`
- `GITHUB_CLIENT_SECRET`
- `DATABASE_URL`
The `render.yaml` file defines:
- Web Service: Next.js dashboard
- Cron Job: Hourly workflow trigger
- Database: PostgreSQL instance
Deploy to Render:
```bash
# Push to GitHub and connect to Render
# Or use the Render Blueprint button
```

After deploying, update the cron job environment variables:

- `RENDER_WORKFLOW_ID`: Your workflow ID from step 5
- `RENDER_API_KEY`: Your Render API key
```bash
# Manual trigger via Render Workflows CLI
render-workflows trigger <WORKFLOW_ID>

# Or trigger via API
cd trigger
python trigger.py
```

Once the workflow completes, access your dashboard at:

https://trender-dashboard.onrender.com
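For reference, `trigger.py` presumably assembles an authenticated call to Render's API. The sketch below only builds the request pieces rather than sending them; the `/workflows/{id}/trigger` path is a placeholder, not a confirmed endpoint, so consult Render's API reference before wiring it up.

```python
RENDER_API_BASE = "https://api.render.com/v1"

def build_trigger_request(workflow_id: str, api_key: str) -> dict:
    # Assemble the pieces of an authenticated trigger call. The
    # /workflows/{id}/trigger path is hypothetical -- check the
    # Render API reference for the real endpoint before sending.
    return {
        "method": "POST",
        "url": f"{RENDER_API_BASE}/workflows/{workflow_id}/trigger",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    }
```

Keeping request construction separate from transport makes the script easy to test without network access.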
- Stores complete GitHub API responses
- Tables: `raw_github_repos`, `raw_repo_metrics`
- Purpose: Audit trail and reprocessing capability
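The extract step might persist each API payload along these lines — a minimal sketch assuming `raw_github_repos` has `repo_id`, `full_name`, and a JSONB `payload` column (these column names are assumptions; the real DDL lives in `database/schema/01_raw_layer.sql`):

```python
import json

def to_raw_row(repo: dict) -> tuple:
    """Flatten a GitHub API repo object into a raw-layer row.

    The full payload is kept as JSON so the staging layer can
    reprocess it later -- the audit-trail purpose noted above.
    """
    return (repo["id"], repo["full_name"], json.dumps(repo))

# With asyncpg the insert would look roughly like (not executed here):
#   await pool.execute(
#       "INSERT INTO raw_github_repos (repo_id, full_name, payload) "
#       "VALUES ($1, $2, $3::jsonb)",
#       *to_raw_row(repo),
#   )
```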
- Cleaned and validated data
- Tables: `stg_repos_validated`, `stg_render_enrichment`
- Data quality scoring (0.0 - 1.0)
- Business rules applied
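A quality score in [0.0, 1.0] can be as simple as field completeness. This is an illustrative sketch only — the actual rules live in `etl/data_quality.py`, and the `REQUIRED_FIELDS` list here is assumed:

```python
# Assumed required fields; the real scorer may check and weight others.
REQUIRED_FIELDS = ("full_name", "language", "stargazers_count", "pushed_at")

def quality_score(repo: dict) -> float:
    # Fraction of required fields that are present and non-null.
    present = sum(1 for field in REQUIRED_FIELDS if repo.get(field) is not None)
    return present / len(REQUIRED_FIELDS)
```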
- Dimensions: `dim_repositories`, `dim_languages`, `dim_render_services`
- Facts: `fact_repo_snapshots`, `fact_render_usage`, `fact_workflow_executions`
- Views: Pre-aggregated analytics for dashboard
The workflow consists of the following tasks, each decorated with `@task`:

- `main_analysis_task`: Orchestrator that spawns parallel tasks
- `fetch_language_repos`: Fetches repos for Python, TypeScript, Go
- `analyze_repo_batch`: Analyzes repos in batches of 10
- `fetch_render_ecosystem`: Fetches Render-related projects
- `analyze_render_projects`: Analyzes Render-specific features
- `aggregate_results`: ETL pipeline execution (Extract → Transform → Load)
- `store_execution_stats`: Records workflow performance metrics
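The orchestrator's fan-out pattern can be sketched with plain asyncio. In the real code these run as distributed Render Workflows tasks via the SDK's `@task` decorator; `fetch_language_repos` below is a stand-in stub, not the actual implementation:

```python
import asyncio

LANGUAGES = ["python", "typescript", "go"]

async def fetch_language_repos(language: str) -> list:
    # Stub for the real GitHub search in github_api.py; simulates an
    # I/O-bound fetch so the fan-out pattern is visible.
    await asyncio.sleep(0)
    return [{"language": language, "full_name": f"example/{language}-trending"}]

async def main_analysis() -> list:
    # Spawn one fetch per tracked language and run them concurrently,
    # mirroring how the orchestrator task fans out parallel workers.
    batches = await asyncio.gather(*(fetch_language_repos(l) for l in LANGUAGES))
    return [repo for batch in batches for repo in batch]

repos = asyncio.run(main_analysis())
```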
- Star Velocity: `(stars_last_7_days / total_stars) * 100`
- Activity Score: Weighted formula using commits, issues, contributors
- Momentum Score: `(star_velocity * 0.4) + (activity_score * 0.6)`
- Render Boost: 1.2x multiplier for projects using Render
- Freshness Penalty: 0.9x for repos older than 180 days
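Put together, the formulas above translate directly into code. One note: the guard for zero total stars is my addition, not stated in the source:

```python
def momentum_score(
    stars_last_7_days: int,
    total_stars: int,
    activity_score: float,
    uses_render: bool,
    age_days: int,
) -> float:
    # Star velocity: share of all-time stars earned in the last week
    star_velocity = (stars_last_7_days / total_stars) * 100 if total_stars else 0.0
    # Weighted blend of velocity and activity
    score = star_velocity * 0.4 + activity_score * 0.6
    if uses_render:
        score *= 1.2  # Render boost
    if age_days > 180:
        score *= 0.9  # freshness penalty
    return score
```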
```bash
cd workflows
pip install -r requirements.txt
python workflow.py
```

```bash
cd dashboard
npm install
npm run dev
# Access at http://localhost:3000
```

```bash
psql $DATABASE_URL -f database/schema/01_raw_layer.sql
psql $DATABASE_URL -f database/schema/02_staging_layer.sql
psql $DATABASE_URL -f database/schema/03_analytics_layer.sql
psql $DATABASE_URL -f database/schema/04_views.sql
```

Technical:
- Process 300+ repos across 3 languages in under 10 seconds
- 3x speedup vs sequential processing
- 99%+ success rate on workflow runs
- Data quality score >= 0.90 for 95%+ repositories
Marketing:
- Showcase 50+ Render ecosystem projects
- Track Render adoption vs competitors
- Identify case study candidates
MIT
Contributions welcome! Please open an issue or submit a pull request.