-
Hello all, I have two dataframes and I'd like to update the values in one dataframe based on the values of another dataframe, if the values of a shared column are equivalent. In Pandas, it looks like this:
We can see that in df1 rows where ID = [3,7] now have the values from df2. How would one do this in Ibis? I have tried the following:
|
Beta Was this translation helpful? Give feedback.
Replies: 3 comments
-
hi, I have a method for my daily routine, you may try: @classmethod
def update_table(cls, t: Table, right:Union[Table, pd.DataFrame],
on:str, col_to_update:str, right_col_updater:str)->Table:
"""updates a column of a table with another table or dataframe
t: ibis table, which is the latest modified one
right: ibis table or pandas dataframe, which will update the table
on: str, column name to be used for joining on
col_to_update: str, column name to be updated in the table
right_col_updater: str, column name to be used for updating
"""
# check if t is ibis table
if not isinstance(t, Table):
raise ValueError("t must be ibis table")
# check if right is ibis table or pandas dataframe
if not isinstance(right, (Table, pd.DataFrame)):
raise ValueError("right must be ibis table or pandas dataframe")
# convert right to ibis table if it is pandas dataframe
if isinstance(right, pd.DataFrame):
right = ibis.memtable(right)
# check if on is in t and right
if on not in t.columns:
raise ValueError(f"{on} not in t")
if on not in right.columns:
raise ValueError(f"{on} not in right")
# check if col_to_update is in t
if col_to_update not in t.columns:
raise ValueError(f"{col_to_update} not in t")
# check if right_col_updater is in right
if right_col_updater not in right.columns:
raise ValueError(f"{right_col_updater} not in right")
right:Table = right[on, right_col_updater]
right = right.rename({"REPLACER_delete_me":right_col_updater})
t = t.join(right, predicates=on, rname="{name}_delete_me", how='left')
t = (t
.mutate(ibis.ifelse(
_["REPLACER_delete_me"].notnull(),
_["REPLACER_delete_me"], _[col_to_update]
).name(col_to_update)
)
).drop(sel.matches(".*_delete_me"))
return t |
Beta Was this translation helpful? Give feedback.
-
this may also be useful: @classmethod
def alter_table(cls, t:Table, condition, replace_dict:dict)->Table:
"""replaces values in a table and returns the ibis table
parameters:
t: ibis table, which is the latest modified one
condition: ibis condition, e.g. (t.col=="value"), warning: conditions must called with t
replace_dict: dict, keys are column names, values are strings to be replaced
"""
# check if cols exists
cols = list(replace_dict.keys())
t_cols = t.columns
if set(cols).difference(t_cols) != set():
raise ValueError(f"columns {set(cols).difference(t_cols)} not in the table")
REPLACEMENTS = replace_dict
return (t
.mutate(
sel.across(
REPLACEMENTS.keys(),
lambda v: ibis.ifelse(
condition, REPLACEMENTS[v.get_name()], v
)
)
)
) |
Beta Was this translation helpful? Give feedback.
-
I would suggest a left outer join on
|
Beta Was this translation helpful? Give feedback.
I would suggest a left outer join on
ID
, followed by selecting theValue
column from the right side of the join: