My Online Training Hub

Learn Dashboards, Excel, Power BI, Power Query, Power Pivot

Remove Duplication|VBA & Macros|Excel Forum|My Online Training Hub

Remove Duplication
Bill Jone
October 1, 2020 - 5:22 am

Hi,

I have more than 4 million codes and I want to remove the duplicates. I put them in 4 columns because Excel limits how many rows fit in one column.
Can you help me remove the duplicates from all the columns at one time?

Thanks;

Bill

Mynda Treacy
October 1, 2020 - 9:03 am

Hi Bill,

Why don't you use Power Query to do this for you? Presumably the data is in a CSV or Text file in one column. You can use Power Query to get the data from the CSV file:

1. Excel 2010 & 2013: go to the Power Query tab. Excel 2016 onward: go to the Data tab.

2. Get Data from Text/CSV.

3. Browse to the file location and import it.

4. In the Power Query editor window, go to the Home tab > Remove Rows > Remove Duplicates, then Close & Load.
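
For reference, the steps above produce a query roughly like the sketch below, which you can view or paste in the Advanced Editor. The file path, the comma delimiter and the UTF-8 encoding (65001) are assumptions for illustration, not details from Bill's file, so adjust them to match your own source.

```
let
    // Hypothetical path: point this at your own CSV or text file.
    Source = Csv.Document(
        File.Contents("C:\Data\codes.csv"),
        [Delimiter = ",", Encoding = 65001]   // assumed delimiter and encoding
    ),
    // Home tab > Remove Rows > Remove Duplicates generates this step.
    // With no column list, a row is removed only if it matches another row in every column.
    #"Removed Duplicates" = Table.Distinct(Source)
in
    #"Removed Duplicates"
```

With a single-column file, Csv.Document names the column Column1, and Table.Distinct keeps one row for each distinct code.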

Mynda

Bill Jone
October 1, 2020 - 7:14 pm

Hi Mynda,

Thank you very much for your fast reply.

Could you please help me with this? I'm using Excel 2010 and I can't find the Power Query tab. Please check the attached image and let me know how I can get it.

Thanks;

Bill

Attachments:
  • Power-Query-1.png (145 KB)

Mynda Treacy
October 1, 2020 - 8:07 pm

Hi Bill,

In Excel 2010, Power Query is a free add-in that you can download here: https://www.microsoft.com/en-u.....x?id=39379

Once it is installed, you'll have a dedicated Power Query tab on the ribbon.

Mynda

Bill Jone
October 10, 2020 - 2:05 am

Hi Mynda,

Please accept my gratitude. Your support means a lot to me!

I installed Power Query and worked out how to remove the duplicates, as you can see in the attached image. I still need your help with how to save the data to a text file after the duplicates are removed.

Thanks;

Bill 

Attachments:
  • Excel.png (221 KB)

Philip Treacy
October 26, 2020 - 2:03 pm

Hi Bill,

When your data is ready, click Close & Load in the top left of the query editor. NOTE: click the text Close & Load, not the icon.

From the sub-menu, click Close & Load To, then choose to load to a table on an existing or new worksheet.

From there you can save the data out as text, for example with File > Save As and a text or CSV file type.

Regards

Phil

Bill Jone
October 28, 2020 - 9:44 pm

Many thanks, Philip.

I did this, but when I save the data as text, the row count in the saved file is not correct.

The count should be 3,407,267, but the text file contains 1,048,576. Please check the attached images.

Thanks;

Bill

Attachments:
  • Counts.png (74 KB)
  • 2020-10-28-13_31_57-Window.png (93 KB)
  • 2020-10-28-13_46_40-Window-1.png (85 KB)

Philip Treacy
October 29, 2020 - 12:38 pm

Hi Bill,

How do you know the count is wrong?

Can you please share the Excel workbook and the source data file(s) so I can check.

Thanks

Phil

Catalin Bombea
October 29, 2020 - 1:55 pm

Hi Bill,

1,048,576 is the maximum number of rows in an Excel worksheet.

We cannot load more records than that into a sheet, so we have to find an alternative.

Here is what you can do:
1. Add a line in the query to split the table into buckets of 1,000,000 records.

split=Table.Split(#"PreviousStepName",1000000),

2. Convert this list to table:

#"Converted to Table" = Table.FromList(split, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

3. Add an index column, starting from 1, step 1.

#"Added Index" = Table.AddIndexColumn(#"Converted to Table", "Index", 1, 1, Int64.Type),

4. Expand Column1. This regenerates the initial data, now with a paging column. If the data has over 3 million rows, that column will hold the numbers 1 to 4, and each page will have up to 1,000,000 records, as set in step 1.

5. Instead of loading to a table in a worksheet, choose to load to a PivotTable Report.

6. In this pivot table, add a slicer for the paging column we added. This lets you see the data in pages, up to 1 million records at a time.
You can also use Show Report Filter Pages (with the paging column in the Filters area), which creates one sheet for each page.
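
Put together, steps 1 to 4 might look like the sketch below in the Advanced Editor. This is an illustration under assumptions, not the query from the attached example: the file path, delimiter, encoding and the output column name "Code" are hypothetical, and the source is assumed to have a single column named Column1.

```
let
    // Hypothetical path: point this at your own CSV or text file.
    Source = Csv.Document(File.Contents("C:\Data\codes.csv"), [Delimiter = ",", Encoding = 65001]),
    #"Removed Duplicates" = Table.Distinct(Source),
    // Step 1: split into a list of tables holding at most 1,000,000 rows each.
    split = Table.Split(#"Removed Duplicates", 1000000),
    // Step 2: turn the list of tables into a one-column table (Column1).
    #"Converted to Table" = Table.FromList(split, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    // Step 3: number the pages 1, 2, 3, ...
    #"Added Index" = Table.AddIndexColumn(#"Converted to Table", "Index", 1, 1, Int64.Type),
    // Step 4: expand the nested tables back into rows; every row keeps its page number in Index.
    #"Expanded Column1" = Table.ExpandTableColumn(#"Added Index", "Column1", {"Column1"}, {"Code"})
in
    #"Expanded Column1"
```

With 3,407,267 rows and a page size of 1,000,000, this gives pages 1 to 3 with 1,000,000 rows each and page 4 with the remaining 407,267 rows.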

Catalin Bombea
October 29, 2020 - 2:27 pm

Example attached.

You can set the CSV file path and the group size to split the data into as many pages as you need.

https://1drv.ms/x/s!AjfS33R8yo.....A?e=IQF9dC

The file is too large, so choose the Open in Desktop App option.
