      Published by David on December 14, 2023

      NYC Open City Data – Data Ingestion Pipeline

      Summary

      A deployed data ingestion pipeline that preprocesses data from a data lake and ingests it into a data warehouse.
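To make the lake-to-warehouse flow concrete, here is a minimal sketch of a batched load. It is not the project's actual code: the table name `service_requests`, the column names, and the `ingest` and `batched` helpers are hypothetical, and the standard-library `sqlite3` module stands in for Postgres so the example runs without a database server.

```python
import csv
import io
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, TextIO


def batched(rows: Iterable[tuple], size: int) -> Iterator[list]:
    """Yield lists of at most `size` rows so memory use stays bounded."""
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


def ingest(source: TextIO, conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Load CSV rows from `source` into the hypothetical `service_requests` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS service_requests ("
        "request_id TEXT, borough TEXT, complaint_type TEXT)"
    )
    reader = csv.DictReader(source)
    # Keep only the columns the (assumed) warehouse table defines.
    rows = ((r["request_id"], r["borough"], r["complaint_type"]) for r in reader)
    loaded = 0
    for batch in batched(rows, batch_size):
        with conn:  # commit each batch as its own transaction
            conn.executemany(
                "INSERT INTO service_requests VALUES (?, ?, ?)", batch
            )
        loaded += len(batch)
    return loaded


if __name__ == "__main__":
    # An in-memory file and database stand in for the data lake and warehouse.
    lake_file = io.StringIO(
        "request_id,borough,complaint_type\n1,BROOKLYN,Noise\n2,QUEENS,Heat\n"
    )
    warehouse = sqlite3.connect(":memory:")
    print(ingest(lake_file, warehouse, batch_size=1))
```

Loading in batches keeps memory use flat regardless of file size, which is one reason this shape is common for ingestion jobs.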

      Skills

      ETL Data Pipelining • Preprocessing • Docker • Data Ingestion • Data Warehouse • Scalability • Postgres • SQL • Data Lake

      Key Highlights

      Real Data: Built on real NYC open city data.

      Scalability: The pipeline is deployed in Docker, which makes it reproducible and easier to scale.

      Preprocessing: The data is cleaned so that it conforms to the data warehouse schema, for example by sanitizing values and dropping unneeded columns.

      Data Management Efficiency: A pgAdmin instance is deployed to manage the ingested data.
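The preprocessing step described above could look roughly like the sketch below. It is an illustration under assumptions, not the project's code: the `SCHEMA` mapping, its column names, and the `clean_row` helper are hypothetical. Each raw row is reduced to the schema's columns, values are normalized, and rows that cannot be cleaned are dropped.

```python
from typing import Optional

# Hypothetical warehouse schema: column name -> converter for that column.
SCHEMA = {
    "request_id": str,
    "borough": lambda value: value.upper(),
    "latitude": float,
}


def clean_row(raw: dict) -> Optional[dict]:
    """Return a row matching SCHEMA, or None if the row cannot be cleaned."""
    cleaned = {}
    for column, convert in SCHEMA.items():
        value = (raw.get(column) or "").strip()
        if not value:
            return None  # drop rows missing a required field
        try:
            cleaned[column] = convert(value)
        except ValueError:
            return None  # drop rows with malformed values
    return cleaned  # columns absent from SCHEMA are left out


if __name__ == "__main__":
    raw = {"request_id": " 42 ", "borough": " brooklyn", "latitude": "40.7", "agency": "NYPD"}
    print(clean_row(raw))
```

Keeping the schema in one mapping means a new warehouse column only needs one new entry, and any source column not listed is dropped automatically.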

      Interested in the full project?

      Project Code
