Big Data Project - TV production company

Table of Contents

Introduction

This project focuses on processing and analyzing large datasets for a television production company. The data, exceeding 20 million records, originates from diverse sources:

  • User contract information and interaction data (TXT files)
  • User log watching history (JSON files)
  • User log search history (Parquet files)

The data is read from MySQL, Azure SQL, and the local file system. It is then transformed and organized into structured insight tables in a PostgreSQL database.
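The flow described above (read heterogeneous sources, then write to PostgreSQL) could be sketched with PySpark roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the extension-to-format mapping, and the connection parameters are all hypothetical, and treating the TXT files as delimited text is an assumption. The Spark calls used (`spark.read.format(...).load(...)` and `df.write.format("jdbc")`) are standard PySpark APIs, and the PostgreSQL JDBC driver would need to be on the Spark classpath.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to a Spark reader format.
FORMAT_BY_EXTENSION = {
    ".txt": "csv",        # assumption: the contract TXT files are delimited text
    ".json": "json",      # watch-history logs
    ".parquet": "parquet" # search-history logs
}


def spark_format_for(path: str) -> str:
    """Return the Spark reader format to use for a source file path."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in FORMAT_BY_EXTENSION:
        raise ValueError(f"unsupported source file type: {path!r}")
    return FORMAT_BY_EXTENSION[suffix]


def postgres_jdbc_options(host, port, database, table, user, password) -> dict:
    """Build the option dict for Spark's JDBC data source (PostgreSQL target)."""
    return {
        "url": f"jdbc:postgresql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
    }


def load_source(spark, path: str, **reader_options):
    """Read one source file or directory into a DataFrame.

    `spark` is an existing SparkSession; `reader_options` are passed through
    to the reader (for example sep="\t" or header="true" for delimited text).
    """
    return (
        spark.read.format(spark_format_for(path))
        .options(**reader_options)
        .load(path)
    )


def write_insight_table(df, jdbc_options: dict, mode: str = "overwrite") -> None:
    """Write a transformed DataFrame to a PostgreSQL table over JDBC."""
    df.write.format("jdbc").options(**jdbc_options).mode(mode).save()
```

With a running `SparkSession`, a call such as `load_source(spark, "logs/watch_history.json")` followed by `write_insight_table(df, postgres_jdbc_options(...))` would cover the file-based sources; the path here is only a placeholder. The MySQL and Azure SQL sources can be read through the same JDBC data source with their own URLs and drivers.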

Data Snapshots

User Contract & Interaction Data


User Watch History Data


User Log Search Data


Technologies

  • Azure SQL
  • MySQL
  • Python
  • Apache Spark
  • PostgreSQL

Project Files