Power BI Retail ETL Project Part 1: Importing and Combining the Online Retail II Dataset
๐ Retail ETL Pipeline Series – Part 1
Power BI ETL Project: Importing and Combining the Online Retail II Dataset
One Dataset. Seven ETL Journeys. Endless Possibilities.
Welcome to Part 1 of the Retail ETL Pipeline Series.
In this series, we will transform a real-world retail dataset into a complete Data Engineering portfolio using multiple architectures and technologies.
The journey starts with the simplest and most beginner-friendly approach:
๐ Power BI Only ETL
๐ Why This Dataset?
Many beginners spend months learning Power BI, SQL, Python, and Cloud technologies but never get the chance to work on a realistic business dataset.
That changes today.
We are using the famous Online Retail II Dataset, a real e-commerce dataset containing more than one million transaction records collected over two years.
- ๐ Real Retail Transactions
- ๐ Revenue Analysis
- ๐ฅ Customer Analytics
- ๐ Multiple Countries
- ๐ Multiple Years
- ๐ฐ Business Intelligence Use Cases
This is exactly the kind of data that helps learners move from tutorials to real projects.
๐ฏ The Goal of Today’s ETL
Before creating dashboards, KPIs, and analytics, we need a clean and structured dataset.
Our objective is simple:
- Import both Excel sheets
- Combine the data
- Prepare the staging layer
- Create a clean foundation for future transformations
๐ Step 1 – Load the Dataset into Power BI
Open Power BI Desktop and import the Online Retail II Excel file.
Power BI will display two worksheets:
- ๐ Year 2009–2010
- ๐ Year 2010–2011
Many beginners ask:
๐ค Should We Select One Sheet or Both?
✅ Select BOTH sheets.
The dataset is split across two years, and our goal is to build a complete retail analytics project.
If we only load one sheet, we lose a significant portion of the business history.
๐ธ Power BI Navigator Screen
Insert your screenshot below this section:
๐ท Insert Power BI Navigator Screenshot Here
(Year 2009–2010 and Year 2010–2011 selected)
⚡ Important Decision: Load or Transform?
When the Navigator screen appears, Power BI provides two options:
- Load
- Transform Data
For a professional ETL workflow:
✅ Choose "Transform Data"
We are building a Data Engineering project, not just importing data.
All cleaning and transformations should happen inside Power Query.
๐ Building a Real Data Engineering Structure
One of the biggest mistakes beginners make is creating everything in a single query.
Professional Data Engineering follows layers.
We will use the same structure across Power BI, PostgreSQL, BigQuery, dbt and Airflow projects.
This layered architecture makes projects easier to maintain, easier to debug and more aligned with industry practices.
๐งน What Will We Clean Next?
After combining the two sheets, we will begin cleaning the data.
Some common issues include:
- ❌ Missing Customer IDs
- ❌ Cancelled Orders
- ❌ Invalid Quantities
- ❌ Invalid Prices
- ❌ Duplicate Records
- ❌ Incorrect Data Types
Each step will be documented so learners can understand not only what we are doing but also why we are doing it.
๐ก Why DETLENG Uses One Dataset Multiple Times
Most learners collect datasets.
We build solutions.
At DETLENG, the focus is not on touching hundreds of datasets.
The focus is on mastering architecture, ETL thinking and business problem solving.
๐ One Dataset. Seven Architectures.
- Power BI ETL
- BigQuery + Power BI
- BigQuery + Looker Studio
- Python + PostgreSQL + Power BI
- Python + BigQuery + Power BI
- dbt + BigQuery + Looker Studio
- Airflow + BigQuery + dbt + Power BI
๐ Final Thoughts
Today’s lesson may look simple:
Import an Excel file. Select two sheets. Open Power Query.
But this is where every professional ETL pipeline begins.
Every dashboard, warehouse, cloud platform, and analytics solution starts with a clean and well-structured dataset.
Take your time. Follow along. Build the project yourself. Break things. Fix things. Learn by doing.
❤️ Build. Learn. Engineer. Grow.
This is the DETLENG way.
Practical Learning • Real Projects • Real Skills
๐ DETLENG.COM
