python - Iterating over rows to add items to dictionary


I have a dataframe with a column that contains lists. I want to a) find the unique values across the lists and b) build a dictionary of the form {uniquevalue: [indexa, indexb, ...]}, where the indices are the index labels of the dataframe rows that contain uniquevalue.

I have done a), but my code for b) creates a dictionary in which every key has all the indexes, regardless of whether the value is contained in that row or not. Can you please help?

    import pandas as pd

    df = pd.read_excel(io = 'links.xlsx')

    unique_list = []
    for row in df['relevant_links']:
        row_list = row.split(sep = ', ')
        unique_list.extend(row_list)

    unique_set = set(unique_list)

    unique_dict = dict.fromkeys(unique_set, [])

    print(unique_dict.keys())

    row_idx = 0
    for row in df['relevant_links']:
        [unique_dict[i].append(row_idx) for i in str(row).split(', ') if i in unique_dict]
        row_idx += 1
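A likely cause of the symptom described above is the call dict.fromkeys(unique_set, []): the empty list is created once and every key ends up pointing at that same list object, so appending through one key shows up under all of them. The snippet below is a minimal sketch of that behaviour; the `keys`, `shared` and `separate` names are made up for illustration and are not part of the original code.

```python
# Hypothetical keys, standing in for unique_set in the question.
keys = {'a', 'c', 'e'}

# dict.fromkeys evaluates [] once, so every key refers to the same list object.
shared = dict.fromkeys(keys, [])
shared['a'].append(0)
print(shared['c'])  # [0], although only 'a' was appended to

# A dict comprehension evaluates [] once per key, so each key gets its own list.
separate = {key: [] for key in keys}
separate['a'].append(0)
print(separate['c'])  # []
```

Building the dictionary with the comprehension form should be enough to keep the index lists independent.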

I think you can use:

    df = pd.DataFrame({'relevant_links':['a, c, v','a, r, e','e, t','e, r']})
    print (df)
      relevant_links
    0        a, c, v
    1        a, r, e
    2           e, t
    3           e, r

    # create a Series with one link per row
    s = df['relevant_links'].str.split(', ', expand=True).stack()
    # group by unique link, collect the row indices into lists, convert to dict
    unique_dict = s.reset_index(name='val').groupby('val')['level_0'].apply(list).to_dict()
    print (unique_dict)
    {'v': [0], 't': [2], 'r': [1, 3], 'e': [1, 2, 3], 'a': [0, 1], 'c': [0]}

    unique_set = s.unique().tolist()
    print (unique_set)
    ['a', 'c', 'v', 'r', 'e', 't']
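If you would rather keep an explicit loop, a plain-Python alternative is possible too. This is only a sketch under some assumptions: `rows` is a stand-in list for df['relevant_links'] using the sample values from this answer, the dataframe has the default 0..n-1 index so the position from enumerate matches the index label, and `index_by_link` is a name invented here.

```python
from collections import defaultdict

# Stand-in for df['relevant_links'] (sample values from the answer).
rows = ['a, c, v', 'a, r, e', 'e, t', 'e, r']

# defaultdict(list) creates a fresh list the first time a key is seen,
# so no list is shared between keys.
index_by_link = defaultdict(list)
for row_idx, row in enumerate(rows):
    for link in str(row).split(', '):
        index_by_link[link].append(row_idx)

# Convert back to a regular dict for printing or further use.
unique_dict = dict(index_by_link)
print(unique_dict)
# {'a': [0, 1], 'c': [0], 'v': [0], 'r': [1, 3], 'e': [1, 2, 3], 't': [2]}
```

The keys of the resulting dictionary are the unique values, so a separate pass to collect them is not needed with this approach.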
