How to split a list into multiple lists using Python
Learn how to split a list into multiple lists based on values in a column in a CSV or Excel file using a Python Script.
If you deal with data as part of your business, you must have stumbled upon this problem where you want to split a list into multiple lists. Many of you look for a solution in Excel or Google Sheets, but a solution in Python is a lot more convenient, especially when you have access to online spaces like Jupyter Notebooks or Google Colab.
Problem Statement: How to split a list into multiple lists based on a column in a spreadsheet
What is the required input to split a list into multiple lists using Python?
A CSV file consisting of all the data you want to split into multiple lists.
What is the output of the Python Script?
- Separate Lists as Multiple CSVs.
- Single Excel (XLSX) file with multiple tabs.
Steps to follow in the Python Script
- Enter the name of the input file.
- Select the column you want to use to split the file.
- Select the output format (CSV or XLSX).
- If the output format is CSV, look for CSV files named after values in the column used for splitting the file.
- If the output format is XLSX, look for an XLSX file with tabs named after values in the column used for splitting the file.
Python script to split a list into multiple lists based on a column in a spreadsheet
import pandas as pd
filename = input("Enter file to be split: ")
try:
df = pd.read_csv(filename, na_values={'Owner':None})
except:
print("Please enter a valid file name and try again!")
exit()
print('\nSelect the index corresponding to the column you want to use to split the file:')
i = 0
for item in df.columns:
print (i, "-", item)
i = i + 1
raw_index = input ('\nSelect the index: ')
print ('\nFile', filename, 'will be split using', df.columns[int(raw_index)], 'as splitter.')
split_df = df.groupby(df.columns[int(raw_index)])
raw_file_type = input ("\nOutput Format:\n 0 - CSV\n 1 - XLSX\nPick the output file type: ")
if int(raw_file_type) == 0:
print("\nInput file", filename, "will be split into multiple CSVs named as: ")
for owner, owner_df in split_df:
print(owner+'.csv')
owner_df.to_csv(owner+'.csv', index = False)
else:
writer = pd.ExcelWriter('Output.xlsx')
print("\nInput file", filename, "will be split into multiple tabs in Output.xlsx with the following sheet names: ")
for splitter, splitter_df in split_df:
print (splitter)
splitter_df.to_excel(writer, sheet_name = splitter, index = False)
writer.save()
print('\n')