Comparing values in Pandas with "diff" and "pct_change"

  • Published: Oct 20, 2024

Comments • 4

  • @AndyWallWasWeak 7 months ago +1

    Off-topic question, but inspired by a moment from this video: when writing my own functions, should I write them for a pd.Series or for a 1-column DataFrame? It sounds like there could be multiple things to keep in mind when deciding.

    • @ReuvenLerner 7 months ago +1

      I'd suggest not writing anything for a 1-column data frame. Either write for a data frame (regardless of size), or for a series / column. I think that the latter is probably a better way to go, overall.
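
As a rough illustration of the "write for a series" approach, here is a minimal sketch. The helper name change_from_first and the sample columns are made up for this example and are not from the video. A function that takes a series can be called on a single column, and "apply" can run it on every column of a data frame, so no special 1-column data frame version is needed.

```python
import pandas as pd


def change_from_first(s: pd.Series) -> pd.Series:
    """Hypothetical helper: each value's difference from the first value."""
    return s - s.iloc[0]


# Made-up sample data for illustration
df = pd.DataFrame({'price': [10, 11, 15], 'volume': [100, 90, 120]})

# Call the function directly on one column (a series)
price_change = change_from_first(df['price'])

# DataFrame.apply passes each column to the function as a series,
# so the same function covers the whole data frame
all_changes = df.apply(change_from_first)
```

Passing a 1-column data frame (df[['price']]) to a function like this would rely on different indexing behavior, which is one reason to settle on either a series or a full data frame.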

  • @tyl9680 5 months ago +1

    What about diff by different categories? Say I have corn, rice, beans and wheat prices in the same df, and I want to compare the price changes within the same categories.

    • @ReuvenLerner 5 months ago +1

      You can totally do this! Just use "diff" on the result of a "groupby". For example:
      from pandas import DataFrame
      df = DataFrame({'category': ['wheat', 'corn', 'rice', 'wheat', 'corn', 'rice', 'wheat', 'corn', 'rice'],
                      'price': [10, 8, 6, 11, 7, 5, 15, 9, 4]})
      df.groupby('category')[['price']].diff()
      You'll get a new data frame back (thanks to the double square brackets around 'price'), showing the difference for each row from the previous occurrence of that category. However, if you want to know which category is which, you'll probably want to join it back to the original data frame:
      df.groupby('category')[['price']].diff().join(df, rsuffix='_df')
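
An alternative to the join, sketched here as one possible approach rather than the method from the reply: assign the grouped result straight back to the original data frame as a new column. The column names price_diff and price_pct_change are made up for this example. The same pattern should work with "pct_change", the other method from the video title.

```python
import pandas as pd

# Same sample data as in the reply above
df = pd.DataFrame({
    'category': ['wheat', 'corn', 'rice'] * 3,
    'price': [10, 8, 6, 11, 7, 5, 15, 9, 4],
})

# Single square brackets give a series back, which aligns with df by index
# and can be stored as a new column next to 'category' and 'price'
df['price_diff'] = df.groupby('category')['price'].diff()

# Fractional change from the previous row of the same category
df['price_pct_change'] = df.groupby('category')['price'].pct_change()
```

The first occurrence of each category has no previous row to compare against, so it gets NaN in both new columns.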