**** UPDATES *****
1. I recommend that you run the code in Google Colab, especially if you run into library installation issues. There is nothing to install in Colab, and it works very well.
2. I have updated the code to include pagination. Useful if you need to archive your UA data before it gets deleted in July 2024. New GitHub link: github.com/tanyazyabkina/Archive_Google_Analytics_UA_data/tree/main
It’s a shame informative channels like these don’t have more views, likes and subs. Thanks for your work 🙏
Thank you very much, very informative video. I can see this channel growing exponentially soon as more people will be interested in data!
Wow, you've just saved me a dozen hours' work. Thank you so much!
Hello, just want to say thank you because you saved me a lot of time with this code!!
Thank you very much from Japan. It's very useful.
Thank you very much for sharing! ... I did not find this information anywhere or in Spanish, so I will try to reproduce it and translate it into Spanish if I have some time.
This video and the guide code helped a lot, thanks a lot Tanya C:
Very interesting tutorial indeed! Thanks for sharing!
Hi,
Your video is very crisp and precise.
However, I want to fetch about 6 dimensions along with some metrics.
The dimensions I want to extract are: country, date, device category, hour, channel grouping, page path, and landing page path.
The metrics are: sessions, page views, bounces, entrances, product revenue, quantity, transaction revenue, transactions, and product checkouts.
When I fetch all the metrics with only the country, date, and device category dimensions, I get the correct values for the metrics (we are checking against the GA web interface). This also works when I add the hour and channel grouping dimensions, for all view IDs except the one with the most data.
But when I include the page path and landing page path dimensions, the metrics come out lower than in the GA web interface.
I have checked whether there is any sampling, but there isn't any.
After that I also tried using page size and page token (the page token takes the next page token as its value, and the page size is 10000).
Hi! Are you having issues with the number of rows? If so, yes you would need to use pagination with a token. There is also sampling available in these reports.
Check out this guide:
developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet
This video helped me a lot, I just wanted to say thank you!
Thank you for your knowledge!
Keep it up, good tutorial
thanks!! this was super helpful
nice share! thanks, it really helps!
Thank you so much! This was very helpful :)
Glad you found it helpful!
Tatiana, thank you for the video. I have finally found a decent explanation of how to work with GA.
Greetings from Siberia (Irkutsk).
And a question right away: how do I get data using pageToken? I have more than 100,000 rows.
Ivan, thank you for the kind words. Greetings to Irkutsk!
Tokens are fairly simple. If you split the request into 10 records per page, this would be the third page:
body = {'reportRequests': [{'viewId': your_view_id,
                            'dateRanges': [{'startDate': '2021-01-01', 'endDate': '2021-04-30'}],
                            'metrics': [{'expression': 'ga:sessions'}],
                            'dimensions': [{'name': 'ga:date'},
                                           {'name': 'ga:country'}],
                            'pageToken': '20',
                            'pageSize': 10
                            }]}
The response then contains the value 'nextPageToken': '30', so you can use it to tell whether the response has ended or not.
The developer guide is here:
developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet
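For anyone who wants the token logic wrapped in a function, here is a minimal sketch. It assumes `run_report` is a callable you supply yourself (a hypothetical name, for example `lambda body: analytics.reports().batchGet(body=body).execute()`), and it handles one report request at a time. It keeps asking for the next page until the response no longer contains `nextPageToken`.

```python
import copy


def fetch_all_rows(run_report, report_request, page_size=10000):
    """Collect all rows for one report request by following nextPageToken.

    run_report is a hypothetical callable that takes a request body and
    returns the parsed batchGet response as a dict.
    """
    request = copy.deepcopy(report_request)  # do not modify the caller's dict
    request['pageSize'] = page_size
    request.pop('pageToken', None)           # always start from the first page
    rows = []
    while True:
        response = run_report({'reportRequests': [request]})
        report = response['reports'][0]
        rows.extend(report.get('data', {}).get('rows', []))
        token = report.get('nextPageToken')
        if not token:                        # no token means this was the last page
            return rows
        request['pageToken'] = token
```

Appending the rows to one list before converting to a table is just one option; writing each page to a file or a database as it arrives would work the same way.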
@@probabilistically I have figured that part out. My question was more about whether there is a function that extracts the data when given reportRequests as input.
I have implemented such a solution, but I am sure it can be done better, because the current one is quite hard to read and is not ready to be scaled or wrapped in a function.
Perhaps you have a better solution for extracting data when there are more than 100,000 rows?
Ivan, ah, I am starting to understand. I apologize that my solutions are probably amateurish, since my field is data analysis, not data engineering or programming.
I would look at which step takes the most time, consider the end goal (storing the data or analyzing it in Python), and optimize accordingly. For example, if the problem is in "gluing" the chunks together, you can export to SQL, and then the records are simply appended to a table. If the problem is in converting to a table, you can try to optimize the process for your specific server response. If the problem is in requesting and receiving the data, then there is probably little that can be done, except perhaps taking a sample of the records.
Good luck!
Wow thank you a lot!
Thanks Tanya, this is a fantastic video. Just one question: how do I fetch more than 1000 rows from the GA API? At present, I see only 1000 records per request.
Use 'pageSize': 100000 in the body of the reportRequests. The Analytics Core Reporting API returns a maximum of 100,000 rows per request, no matter how many you ask for.
Hello, can I migrate this same code to GA4? UA will be discontinued next month.
It's a similar process for GA4, just a different API. Here is the video: ruclips.net/video/HbxIXEfl-Hs/видео.html
Great explanation and thanks for the GitHub link. Do you know if it's possible to combine these features (such as bounce rate, visits, etc.) with demographic features (such as age, sex, etc.)?
I am glad it helped. Yes, demo variables can be combined with most of the metrics. Just drop the demographics into the dimension fields and views/sessions/bounce rates into the metrics, and you should be good.
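A small sketch of what that could look like as a request body. The helper name `build_report_body` is made up for this example, and the view ID and dates are placeholders; `ga:userAgeBracket` and `ga:userGender` are the UA demographic dimensions.

```python
def build_report_body(view_id, start_date, end_date, metrics, dimensions):
    """Build a batchGet body from plain lists of metric and dimension names."""
    return {'reportRequests': [{
        'viewId': view_id,
        'dateRanges': [{'startDate': start_date, 'endDate': end_date}],
        'metrics': [{'expression': name} for name in metrics],
        'dimensions': [{'name': name} for name in dimensions],
    }]}


# Demographics go into the dimensions, sessions and bounce rate into the metrics.
body = build_report_body(
    'your_view_id', '2021-01-01', '2021-04-30',
    metrics=['ga:sessions', 'ga:bounceRate'],
    dimensions=['ga:userAgeBracket', 'ga:userGender'],
)
```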
Why can't I see the third column? I cannot get the view ID.
Will this still work for G.A4?
GA4 uses a different API called "Google Analytics Data API". You can add both to your project, and use them both from the same service account, though through different packages. Here are the details: ruclips.net/video/HbxIXEfl-Hs/видео.html
Hi, I want to extract cohort dimensions but I am getting the error below. Can you please help me resolve this issue?
Error: The request has cohort dimensions without a cohort definition
I think you need to define cohort using this definition: developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet#CohortGroup
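As a rough sketch of what a cohort definition might look like in the request: the function name, cohort name, and dates below are placeholders, and the structure follows the CohortGroup section of the guide linked above. Note that the dates go inside the cohort itself, and this sketch leaves out the usual top-level dateRanges.

```python
def cohort_report_request(view_id, start_date, end_date, cohort_name='cohort_1'):
    """Build one report request that defines a single first-visit cohort."""
    return {
        'viewId': view_id,
        # The cohort carries its own dateRange, so no top-level dateRanges here.
        'dimensions': [{'name': 'ga:cohort'}],
        'metrics': [{'expression': 'ga:cohortActiveUsers'}],
        'cohortGroup': {
            'cohorts': [{
                'name': cohort_name,
                'type': 'FIRST_VISIT_DATE',
                'dateRange': {'startDate': start_date, 'endDate': end_date},
            }],
        },
    }


body = {'reportRequests': [cohort_report_request('your_view_id', '2021-01-01', '2021-01-01')]}
```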
Great video - thanks for posting. I am not sure how to get the view_id - are you able to point out how? Thanks.
That's a very good question. It appears that you may be able to list views using the Account Management API (presuming it's enabled). developers.google.com/analytics/devguides/config/mgmt/v3/mgmtReference/management/profiles/list
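A possible sketch of that call, untested against a live account. It assumes you already have a Management API v3 service object (for example from `build('analytics', 'v3', credentials=credentials)` in googleapiclient) and that the service account has access to the views. The function name `list_views` is made up for this example.

```python
def list_views(management_service):
    """Return (view_id, view_name) pairs for every view the credentials can see.

    management_service is assumed to be a Management API v3 service object.
    """
    response = management_service.management().profiles().list(
        accountId='~all',       # '~all' asks for all accounts the user can access
        webPropertyId='~all',   # and all properties within them
    ).execute()
    return [(item['id'], item['name']) for item in response.get('items', [])]
```

The `id` value of each item is what goes into `viewId` in the reporting request.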
@@probabilistically Thank you
Thanks for the tutorial. But I can only download 1000 rows. I know there is something related to limit or page size. How do I fix it?
developers.google.com/analytics/devguides/reporting/core/v4/basics#pagination
Try using pageSize and pageToken parameters in the request.
@@probabilistically Should I define the token myself, or is it fetched from the previous step? Thanks
@@waqitshatasheel5875 nextPageToken is part of the response that you get when there are more rows that can be returned.
Hi,
How can I use OAuth to authenticate instead of creds.json?
It looks like you can use OAuth with the Command Line Interface, but not with the client libraries. Maybe you can authenticate an app and then run the code from it. developers.google.com/analytics/devguides/reporting/data/v1/quickstart-cli
Thanks for the tutorial! I have a question though: if I want to get more than the first 10 referrals (50, for example), how can I do it? The code only lets me view the first 10.
Do you know if this is a restriction on the number of rows by Google or in python?
@@probabilistically in python, because in G.A I can choose for 10, 25, 50 or more rows in the lower right corner
@@jose.antoniogonzalezbustos3259 Do you get the full dataset when you export your data into csv?
@@probabilistically yes, you get all of them and thats what i would like to get in my table in python :C
@@jose.antoniogonzalezbustos3259 How many rows do you have? Some commands will cut the rows, like df.head() shows 5 rows by default, but you can specify df.head(20). And some of the editors may limit the number of rows in the output.
Hi, will you do a tutorial like this for the GA4 api (Google Data api)?
Here is my GA4 python API video ruclips.net/video/HbxIXEfl-Hs/видео.html
I am not able to see view settings under all website data. Why is that happening and could you please guide me on how to fix it.
I set up the view_id and used your code to fetch the data, but the output that I am getting is blank.
@@probabilistically ok will try..
The video is just great! The only thing is, I could not find the link to the Git repo.....
github.com/tanyazyabkina/GoogleAnalyticsReportingAPI_python
Strangely enough, the description does not open for me either when I am logged in. If you open the video in an incognito window, everything works.
Can viewId be replaced with propertyID?
As far as I know, no. A single property can contain many views, and the system needs to know which one.
Bad Ass!
I am getting this error: Error: time data '202304' does not match format '%Y%m%d' , when using combination of ga:users and ga:yearMonth. Any idea how to resolve this?
You need to use the exact date in '2023-01-01' format.
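If the error comes from the step that converts a dimension column to dates (an assumption, since the failing line is not shown), another option might be to pick the format from the length of the value: ga:date returns 8 digits such as '20230415', while ga:yearMonth returns 6 digits such as '202304'. The helper name `parse_ga_period` is made up for this sketch.

```python
from datetime import datetime

# Format chosen by string length: ga:date is YYYYMMDD, ga:yearMonth is YYYYMM.
GA_PERIOD_FORMATS = {8: '%Y%m%d', 6: '%Y%m'}


def parse_ga_period(value):
    """Parse a ga:date or ga:yearMonth string into a datetime."""
    fmt = GA_PERIOD_FORMATS.get(len(value))
    if fmt is None:
        raise ValueError(f'unexpected GA date value: {value!r}')
    return datetime.strptime(value, fmt)
```

Choosing by length rather than trying '%Y%m%d' first is deliberate: a value like '202311' can be accepted by '%Y%m%d' and silently read as January 1.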