Thanks a lot, Richard He, for creating this insightful video on Analytics Hub.
Hey, thanks for such a detailed video. Just a request: can you please make a video on usage metrics in Analytics Hub?
Usage metrics? Do you mean how to find or use those? I don’t recall there being anything built in for that.
Sir, please make a video on the same topic with VPC SC.
Can you give a bit more detail on what problem you are trying to solve with VPC SC?
Our team published something a while back that you might find useful: medium.com/@vmo2techteam/how-we-secured-our-data-on-the-cloud-341d4ac394b9
@practicalgcp2780 The problem statement is something like this:
Let's say we have a VPC SC restricted environment where Project A is a centralised data sharing project for BigQuery. We create exchanges and listings in Project A to share data with the other projects that consume it. To allow communication between Project A and those consuming projects, what should the VPC SC service perimeter policies be? For example, the ingress and egress policies.
It really depends on how you set things up in your org. Typically you may not want too many perimeters in the same org, because the overhead may be too much. A single perimeter for the whole org is also a valid setup: it protects against risks from outside the org, while within the org no whitelisting is required.
I haven’t done this for Analytics Hub, but I believe it’s the same: you need to whitelist both ingress and egress rules, as you are trying to get access to data from outside your org.
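As a rough illustration of what "both ingress and egress rules" could look like, here is a minimal Python sketch that builds the two rule bodies in the field layout VPC SC uses for ingress and egress policies and prints them as JSON. Everything specific in it is an assumption for illustration: the project numbers are hypothetical, the service list is a guess at what a consumer would call, and `ANY_IDENTITY` is deliberately broad and would normally be narrowed to specific identities. It has not been verified against an Analytics Hub setup, in line with the caveat above.

```python
import json

# Hypothetical project numbers, for illustration only.
SHARING_PROJECT = "projects/111111111111"   # "Project A", holds the exchanges
CONSUMER_PROJECT = "projects/222222222222"  # a project consuming the shared data

# Assumed list of services the consumer needs to reach.
SERVICES = ["bigquery.googleapis.com", "analyticshub.googleapis.com"]


def operations(services):
    """Allow every method of each listed service."""
    return [
        {"serviceName": name, "methodSelectors": [{"method": "*"}]}
        for name in services
    ]


def ingress_rule(source_project, target_project, services):
    """Rule for the perimeter around the shared data: lets the source in."""
    return {
        "ingressFrom": {
            "identityType": "ANY_IDENTITY",  # assumption: narrow this in practice
            "sources": [{"resource": source_project}],
        },
        "ingressTo": {
            "operations": operations(services),
            "resources": [target_project],
        },
    }


def egress_rule(target_project, services):
    """Rule for the consumer's perimeter: lets requests out to the target."""
    return {
        "egressFrom": {"identityType": "ANY_IDENTITY"},
        "egressTo": {
            "operations": operations(services),
            "resources": [target_project],
        },
    }


ingress = ingress_rule(CONSUMER_PROJECT, SHARING_PROJECT, SERVICES)
egress = egress_rule(SHARING_PROJECT, SERVICES)
print(json.dumps({"ingress": [ingress], "egress": [egress]}, indent=2))
```

In this sketch the ingress rule would sit on the perimeter that contains Project A and the egress rule on the consuming project's perimeter, if it has one. With a single perimeter covering the whole org, neither rule should be needed for projects inside it.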
Thank you for sharing, Richard! I am truly interested in exploring the concept of a Data Clean Room; there is a desire to facilitate data sharing for transformation without the need for data movement processes. So is the better way to do data sharing with Analytics Hub to create a new project for Analytics Hub, so that this project centralises the sharing process?
Thank you for the comment! In my opinion, it’s a good model to create a centralised project for the exchanges, so you can centralise who owns them and who can publish, and keep naming conventions consistent. That way it doesn’t become a mess.
This is a pretty cool breakdown - where do you see the Analytics Hub configuration sitting? In the data generator project, or in a project of its own?
Thank you! I am not sure what the best design is, but in my opinion it would be better to keep all the exchanges in a single separate project that is managed by the data platform team. That way you can apply governance and privacy controls more easily. If you keep them in the source projects, you could still end up with each team doing whatever they like, and it’s more difficult to monitor as well.
Thank you for that. One question - if I'm not mistaken, Analytics Hub won't assist when querying tables located across multiple regions, like the US and EU, without some form of replication. Is that correct?
Hi there, no it won’t. But Google just announced dataset replication in preview, check it out here: cloud.google.com/bigquery/docs/data-replication
Actually, I think I may have misunderstood the purpose of data replication. I think it is designed more for a primary/replica disaster recovery sort of use case, or for data migrations between regions, not for the ability to query the data in a separate region, which I think is what you are trying to achieve.
How can we share an authorized view using Analytics Hub?
It makes no difference using authorised views, as authorised view permissions are managed the same way as tables, unlike normal views. However, using authorised views has some tradeoffs, a key one being the loss of metadata such as column descriptions, which isn’t great for data consumers. But it does have an advantage if you don’t want to duplicate data models or increase latency.