TUTORIAL: Archive Google Analytics (UA) Data using Python

  • Published: Sep 12, 2024

Comments • 66

  • @probabilistically
    @probabilistically  3 months ago

    **** UPDATES *****
    1. I recommend that you run the code in Google Colab, especially if you run into library installation issues. There is nothing to install in Colab, and it works very well.
    2. I have updated the code to include pagination. Useful if you need to archive your UA data before it gets deleted in July 2024. New GitHub link: github.com/tanyazyabkina/Archive_Google_Analytics_UA_data/tree/main

  • @kevinhoang9074
    @kevinhoang9074 2 years ago +5

    It’s a shame informative channels like these don’t have more views, likes and subs. Thanks for your work 🙏

  • @jivandibiase7886
    @jivandibiase7886 3 years ago +1

    Thank you very much, very informative video. I can see this channel growing exponentially soon as more people become interested in data!

  • @michellewhitehead6540
    @michellewhitehead6540 2 years ago +1

    Wow, you've just saved me a dozen hours' work. Thank you so much!

  • @Desleiden
    @Desleiden 2 years ago +2

    Hello, just want to say thank you because you saved me a lot of time with this code!!

  • @japanmasa4085
    @japanmasa4085 2 years ago +1

    Thank you very much from Japan. It's very useful.

  • @hardroko
    @hardroko 2 years ago +2

    Thank you very much for sharing! ... I did not find this information anywhere or in Spanish, so I will try to reproduce it and translate it into Spanish if I have some time.

  • @hichomsky1
    @hichomsky1 3 years ago +2

    This video and the guide code helped a lot, thanks a lot Tanya C:

  • @FinanceNik
    @FinanceNik 3 years ago +2

    Very interesting tutorial indeed! Thanks for sharing!

  • @shambhavitiwari9718
    @shambhavitiwari9718 2 years ago +1

    Hi,
    Your video is very crisp and precise.
    However, I want to fetch about 6 dimensions along with some metrics.
    The dimensions I want to extract are: country, date, device category, hour, channel grouping, page path, landing page path.
    The metrics are: sessions, page views, bounces, entrances, product revenue, quantity, transaction revenue, transactions, and product checkouts.
    When I fetch all the metrics with only the dimensions country, date, and device category, I get the correct values for the metrics (we are checking against the GA web interface). This also works with all the metrics and the dimensions country, date, device category, hour, and channel grouping, for all view IDs except the one with the most data.
    If I include the page path and landing page path dimensions, the metrics come out lower than in the GA web interface.
    I have checked whether there is any sampling, but there isn't any.
    After that I also tried using page sessions and page token (page token takes the next page token as its value, and page sessions takes 10000).

    • @probabilistically
      @probabilistically  2 years ago +1

      Hi! Are you having issues with the number of rows? If so, yes you would need to use pagination with a token. There is also sampling available in these reports.
      Check out this guide:
      developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet

  • @thaylasilva4744
    @thaylasilva4744 2 years ago

    This video helped me a lot, I just wanted to say thank you!

  • @Nurlan_Turganov
    @Nurlan_Turganov 2 years ago +1

    Thank you for your knowledge!

  • @arabtechonologies3785
    @arabtechonologies3785 2 years ago +1

    Keep it up, good tutorial

  • @Monsalvo888
    @Monsalvo888 2 years ago +1

    thanks!! this was super helpful

  • @ghilmanfatih9751
    @ghilmanfatih9751 2 years ago +1

    nice share! thanks, it really helps!

  • @DM-vr8pr
    @DM-vr8pr 3 years ago +1

    Thank you so much! This was very helpful :)

  • @DataLikeQWERTY
    @DataLikeQWERTY 3 years ago +1

    Tatyana, thank you for the video. I have finally found a proper explanation of how to work with GA.
    Greetings from Siberia (Irkutsk).
    And a question right away: how do I get the data through pageToken? I have more than 100,000 rows.

    • @probabilistically
      @probabilistically  3 years ago +1

      Ivan, thank you for the kind words. Greetings to Irkutsk!
      Tokens are fairly simple. If you split the request into 10 records per page, this will be the third page:
      body = {'reportRequests': [{
          'viewId': your_view_id,  # placeholder: your own view ID
          'dateRanges': [{'startDate': '2021-01-01', 'endDate': '2021-04-30'}],
          'metrics': [{'expression': 'ga:sessions'}],
          'dimensions': [{'name': 'ga:date'},
                         {'name': 'ga:country'}],
          'pageToken': '20',  # token for the third page
          'pageSize': 10,     # rows per page
      }]}
      The response then contains the value 'nextPageToken': '30', so you can use it to tell whether the results have ended or not.
      The developer guide is here:
      developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet
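
The paging described in this reply can be wrapped in a small loop. This is only a minimal sketch, not the code from the video: `fetch_all_rows` and `fetch_page` are hypothetical names, and `fetch_page` is assumed to be a callable that sends one request body to the Reporting API v4 and returns the parsed response.

```python
import copy


def fetch_all_rows(fetch_page, report_request, page_size=10000):
    """Collect the rows from every page of a single report request.

    fetch_page: callable taking a request body dict and returning the
        response dict (assumed to wrap the batchGet call).
    report_request: one entry of 'reportRequests', without paging fields.
    """
    request = copy.deepcopy(report_request)  # leave the caller's dict untouched
    request["pageSize"] = page_size
    rows = []
    while True:
        response = fetch_page({"reportRequests": [request]})
        report = response["reports"][0]
        rows.extend(report.get("data", {}).get("rows", []))
        token = report.get("nextPageToken")
        if not token:  # no token means this was the last page
            return rows
        request["pageToken"] = token
```

With the service object from the tutorial, the callable could be `lambda body: service.reports().batchGet(body=body).execute()`. The loop stops as soon as a response arrives without `nextPageToken`.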

    • @DataLikeQWERTY
      @DataLikeQWERTY 3 years ago +1

      @probabilistically I have figured that part out. My question was rather whether there is a function that extracts the data when you pass it reportRequests as input.
      I have implemented such a solution, but I am sure it can be done better, because the current one is quite hard to read and is not ready to be scaled or wrapped in a function.
      Perhaps you have a better solution for extracting the data when there are more than 100,000 rows.

    • @probabilistically
      @probabilistically  3 years ago

      Ivan, ah, I am starting to understand. I apologize that my solutions are most likely amateurish, since my field is data analysis rather than data engineering or programming.
      I would look at which step takes the most time, look at the end goal (storing the data or analyzing it in Python), and optimize accordingly. For example, if the problem is "gluing" the chunks together, you can export to SQL, and then the records will simply be appended to a table. If the problem is converting the response into a table, you can try to optimize the process for your specific server response. If the problem is in requesting and receiving the data, then there is probably little that can be done, apart from sampling the records.
      Good luck!
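
The idea of exporting each chunk to SQL so that records are simply appended to a table might look like the sketch below. It uses SQLite from the standard library. The table name `ga_sessions`, the function name `append_rows`, and the column layout (two dimensions, one metric, matching the `ga:date` / `ga:country` / `ga:sessions` request earlier in this thread) are assumptions for illustration.

```python
import sqlite3


def append_rows(conn, api_rows):
    """Append one page of Reporting API v4 rows to a SQLite table.

    Each row is assumed to look like
    {'dimensions': [date, country], 'metrics': [{'values': [sessions]}]}.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ga_sessions "
        "(date TEXT, country TEXT, sessions INTEGER)"
    )
    records = [
        (
            row["dimensions"][0],
            row["dimensions"][1],
            int(row["metrics"][0]["values"][0]),  # the API returns strings
        )
        for row in api_rows
    ]
    conn.executemany("INSERT INTO ga_sessions VALUES (?, ?, ?)", records)
    conn.commit()
    return len(records)
```

Calling `append_rows` once per page avoids holding all pages in memory and concatenating them at the end.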

  • @salivona
    @salivona 2 years ago +1

    Wow thank you a lot!

  • @srinistas
    @srinistas 2 years ago +1

    Thanks Tanya, this is a fantastic video. I just need one piece of information: how do I fetch more than 1000 rows from the GA API? At present, I see only 1000 records per request.

    • @kashifkudalkar
      @kashifkudalkar 2 years ago +1

      Use 'pageSize': 100000 in the body of the reportRequests. The Analytics Core Reporting API returns a maximum of 100,000 rows per request, no matter how many you ask for.

  • @playkill51
    @playkill51 1 year ago

    Hello, can I migrate this same code to GA4? UA will be discontinued next month.

    • @probabilistically
      @probabilistically  1 year ago

      It's a similar process for GA4, just a different API. Here is the video: ruclips.net/video/HbxIXEfl-Hs/видео.html

  • @nuusain996
    @nuusain996 3 years ago +1

    Great explanation, and thanks for the GitHub link. Do you know if it's possible to combine these features (such as bounce rate, visits, etc.) with demographic features (such as age, sex, etc.)?

    • @probabilistically
      @probabilistically  3 years ago

      I am glad it helped. Yes, demo variables can be combined with most of the metrics. Just drop the demographics into the dimension fields and views/sessions/bounce rates into the metrics, and you should be good.
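
As a rough illustration of dropping demographics into the dimension fields, here is a minimal sketch of a request body. The helper name `build_report_body` and the view ID are hypothetical. `ga:userAgeBracket` and `ga:userGender` are the UA dimension names for age and gender, and the date range simply reuses the one from the pagination example in this thread.

```python
def build_report_body(view_id, start_date, end_date, metrics, dimensions):
    """Build a Reporting API v4 request body with a single report request."""
    return {
        "reportRequests": [
            {
                "viewId": view_id,
                "dateRanges": [{"startDate": start_date, "endDate": end_date}],
                "metrics": [{"expression": m} for m in metrics],
                "dimensions": [{"name": d} for d in dimensions],
            }
        ]
    }


# Demographics go in as dimensions, sessions and bounce rate as metrics.
body = build_report_body(
    "123456789",  # placeholder view ID
    "2021-01-01",
    "2021-04-30",
    metrics=["ga:sessions", "ga:bounceRate"],
    dimensions=["ga:userAgeBracket", "ga:userGender"],
)
```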

  • @robsonc.4047
    @robsonc.4047 2 years ago

    Why can't I see the third column? I cannot get the view ID.

  • @jose.antoniogonzalezbustos3259
    @jose.antoniogonzalezbustos3259 1 year ago +1

    Will this still work for GA4?

    • @probabilistically
      @probabilistically  1 year ago

      GA4 uses a different API called "Google Analytics Data API". You can add both to your project, and use them both from the same service account, though through different packages. Here are the details: ruclips.net/video/HbxIXEfl-Hs/видео.html

  • @himanshudrish
    @himanshudrish 1 year ago

    Hi, I want to extract cohort dimensions but I am getting the error below. Can you please help me resolve this issue?
    Error: The request has cohort dimensions without a cohort definition

    • @probabilistically
      @probabilistically  1 year ago

      I think you need to define the cohort using this definition: developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet#CohortGroup
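
A cohort definition in the request body might look like the sketch below, following the CohortGroup structure in the linked reference. The helper name `build_cohort_body`, the cohort name, the view ID, and the dates are all placeholders. The request-level `dateRanges` field is left out here on the assumption that the cohort's own `dateRange` takes its place.

```python
def build_cohort_body(view_id, cohort_name, start_date, end_date):
    """Build a Reporting API v4 request body that defines one cohort."""
    return {
        "reportRequests": [
            {
                "viewId": view_id,
                "dimensions": [
                    {"name": "ga:cohort"},
                    {"name": "ga:cohortNthDay"},
                ],
                "metrics": [{"expression": "ga:cohortActiveUsers"}],
                # Cohort dimensions need this block to say which users form the cohort.
                "cohortGroup": {
                    "cohorts": [
                        {
                            "name": cohort_name,
                            "type": "FIRST_VISIT_DATE",
                            "dateRange": {
                                "startDate": start_date,
                                "endDate": end_date,
                            },
                        }
                    ]
                },
            }
        ]
    }


body = build_cohort_body("123456789", "cohort_1", "2023-04-01", "2023-04-01")
```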

  • @ghay3
    @ghay3 2 years ago

    Great video - thanks for posting. I am not sure how to get the view_id - are you able to point out how? Thanks.

    • @probabilistically
      @probabilistically  2 years ago +1

      That's a very good question. It appears that you may be able to list views using the Account Management API (presuming it's enabled). developers.google.com/analytics/devguides/config/mgmt/v3/mgmtReference/management/profiles/list
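
Listing views through the Management API could be sketched as follows. The function name `list_views` is hypothetical. `service` is assumed to be a Management API v3 client (for example one created with `build('analytics', 'v3', credentials=...)` from google-api-python-client), and the service account is assumed to have access to the views.

```python
def list_views(service):
    """Return (view_id, view_name, property_id) for every accessible view.

    service: a Management API v3 client object (assumed; see lead-in).
    """
    response = (
        service.management()
        .profiles()
        # '~all' asks for views across all accounts and properties
        .list(accountId="~all", webPropertyId="~all")
        .execute()
    )
    return [
        (item["id"], item["name"], item["webPropertyId"])
        for item in response.get("items", [])
    ]
```

The first element of each tuple is the value to use as `viewId` in the reporting requests.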

    • @ghay3
      @ghay3 2 years ago

      @probabilistically Thank you

  • @waqitshatasheel5875
    @waqitshatasheel5875 2 years ago

    Thanks for the tutorial. But I can only download 1000 rows. I know it is something related to the limit or page size. How do I fix it?

    • @probabilistically
      @probabilistically  1 year ago

      developers.google.com/analytics/devguides/reporting/core/v4/basics#pagination
      Try using pageSize and pageToken parameters in the request.

    • @waqitshatasheel5875
      @waqitshatasheel5875 1 year ago

      @probabilistically Should I define this token, or is it fetched from the previous step? Thanks

    • @probabilistically
      @probabilistically  1 year ago

      @waqitshatasheel5875 nextPageToken is part of the response that you get when there are more rows that can be returned.

  • @user-ce7dl3bx5f
    @user-ce7dl3bx5f 1 year ago

    Hi,
    How can I use OAuth to authenticate instead of creds.json?

    • @probabilistically
      @probabilistically  1 year ago

      It looks like you can use OAuth with the command-line interface, but not with the client libraries. Maybe you can authenticate an app and then run the code from it. developers.google.com/analytics/devguides/reporting/data/v1/quickstart-cli

  • @jose.antoniogonzalezbustos3259

    Thanks for the tutorial! I have a question though: if I want to get more than the first 10 referrals (50, for example), how can I do it? The code only lets me view the first 10.

    • @probabilistically
      @probabilistically  1 year ago +1

      Do you know if this is a restriction on the number of rows by Google or in python?

    • @jose.antoniogonzalezbustos3259
      @jose.antoniogonzalezbustos3259 1 year ago

      @probabilistically In Python, because in GA I can choose 10, 25, 50 or more rows in the lower right corner.

    • @probabilistically
      @probabilistically  1 year ago

      @jose.antoniogonzalezbustos3259 Do you get the full dataset when you export your data into CSV?

    • @jose.antoniogonzalezbustos3259
      @jose.antoniogonzalezbustos3259 1 year ago

      @probabilistically Yes, you get all of them, and that's what I would like to get in my table in Python :C

    • @probabilistically
      @probabilistically  1 year ago

      @jose.antoniogonzalezbustos3259 How many rows do you have? Some commands will cut the rows: df.head() shows 5 rows by default, but you can specify df.head(20). Some editors may also limit the number of rows in the output.

  • @martamaecka1267
    @martamaecka1267 2 years ago

    Hi, will you do a tutorial like this for the GA4 API (Google Data API)?

    • @probabilistically
      @probabilistically  1 year ago

      Here is my GA4 Python API video: ruclips.net/video/HbxIXEfl-Hs/видео.html

  • @abhinavkale4632
    @abhinavkale4632 3 years ago

    I am not able to see view settings under all website data. Why is that happening and could you please guide me on how to fix it.

    • @abhinavkale4632
      @abhinavkale4632 3 years ago

      I did set up the view_id and used your code to fetch the data, but the output I am getting is blank. This is what I am getting in the output: ___

    • @abhinavkale4632
      @abhinavkale4632 3 years ago

      @probabilistically OK, will try.

  • @oscarakchurin4350
    @oscarakchurin4350 2 years ago

    The video is just great! The only thing is that I could not find the link to the Git repository...

    • @probabilistically
      @probabilistically  2 years ago

      github.com/tanyazyabkina/GoogleAnalyticsReportingAPI_python
      Strangely enough, the description does not open for me either when I am logged in. If you open the video in an incognito window, everything works.

  • @oscarakchurin4350
    @oscarakchurin4350 2 years ago

    Can viewId be replaced with propertyID?

    • @probabilistically
      @probabilistically  2 years ago

      As far as I know, no. A single property can contain many views, and the system needs to know which one to use.

  • @ProgrammingNewbie
    @ProgrammingNewbie 1 year ago

    Bad Ass!

  • @SahilVerma-nf4zx
    @SahilVerma-nf4zx 1 year ago

    I am getting this error: Error: time data '202304' does not match format '%Y%m%d' , when using combination of ga:users and ga:yearMonth. Any idea how to resolve this?
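
One possible reading of this error, offered as an assumption since the failing code is not shown: a date-parsing step expects `ga:date` values (YYYYMMDD), while `ga:yearMonth` returns six-digit values such as '202304'. A minimal sketch that picks the format from the length of the value is below. `parse_ga_period` is a hypothetical helper, not part of the tutorial code.

```python
from datetime import datetime

# Format for each value length: ga:date is YYYYMMDD, ga:yearMonth is YYYYMM.
GA_FORMATS = {8: "%Y%m%d", 6: "%Y%m"}


def parse_ga_period(value):
    """Parse a ga:date or ga:yearMonth string into a datetime."""
    fmt = GA_FORMATS.get(len(value))
    if fmt is None:
        raise ValueError(f"unexpected GA date value: {value!r}")
    return datetime.strptime(value, fmt)
```

A `ga:yearMonth` value parsed this way lands on the first day of that month, because `strptime` fills in day 1 when the format has no day field.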