0:00 Welcome to Data Integrity
4:22 Why Data Integrity is so Important
* Data integrity: accuracy, completeness, consistency, and trustworthiness of data throughout its lifecycle
* Data can be compromised during data replication, data transfer, data manipulation, ...
7:15 Balancing Objective and Integrity
10:31 Dealing with Insufficient Data
Types of Insufficient Data:
* Only one source >> misses the overall picture, and is also not enough data.
* Data keeps updating.
* Outdated data.
* Geographically-limited data.
How to deal with it:
* Find more data sources.
* Collect more data if time allows.
* Talk to stakeholders, and adjust the objective.
* Look for a new dataset.
14:22 The Importance of Sample Size
* Collecting data from the whole population is often too time-consuming and expensive.
* So a sample of the population can be collected to represent the whole, which saves time and cost.
17:51 Using Statistical Power
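* Perform the test on a "larger" sample size: the larger the sample, the higher the statistical power.
* Hypothesis testing: a survey/experiment designed to get meaningful results.
* Statistical power is the probability that a hypothesis test detects an effect that really exists; it is a separate quantity from the confidence level.
* A statistical power of at least 0.8 (80%) is the usual threshold for trusting the results of a hypothesis test.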
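To make the 0.8 threshold concrete, here is a rough Python sketch of how statistical power grows with sample size. It is not from the video: it assumes a two-sided one-sample z-test, and the names `z_test_power`, `min_sample_size_for_power`, and the `effect_size` input (the assumed true effect in standard-deviation units) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    effect_size: assumed true effect in standard-deviation units (hypothetical input)
    n:           sample size
    alpha:       significance level, i.e. 1 - confidence level
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha=0.05
    shift = effect_size * sqrt(n)                # how far the true effect moves the test statistic
    # Probability that the test statistic lands in either rejection region
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def min_sample_size_for_power(effect_size, target_power=0.8, alpha=0.05):
    """Smallest n (searched upward from 2) whose power reaches target_power."""
    if effect_size == 0:
        raise ValueError("effect_size must be non-zero, otherwise power never reaches the target")
    n = 2
    while z_test_power(effect_size, n, alpha) < target_power:
        n += 1
    return n


if __name__ == "__main__":
    for n in (10, 20, 32, 50):
        print(n, round(z_test_power(0.5, n), 3))
    print("minimum n for 80% power:", min_sample_size_for_power(0.5))
```

Under these assumptions, a medium-sized effect of 0.5 standard deviations needs about 32 observations to reach 80% power. Smaller effects need noticeably larger samples, which is why the notes stress a "larger" sample size.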
22:51 Determine Sample Sizes Through Population and Purpose
* Focus on the big picture: a proper sample size should make sense for the purpose and lead to valid, useful results.
* To find a sample size, you need to know: the population size, confidence level, and margin of error.
>> Note: confidence level and margin of error do not have to add up to 100%; they are not dependent on each other.
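The three inputs listed above can be turned into a sample size with a short calculation. The sketch below is one common approach, not necessarily the calculator used in the video: it assumes Cochran's formula for a proportion with a finite population correction, and it assumes the most conservative proportion `p = 0.5`. The function name `sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence_level, margin_of_error, p=0.5):
    """Estimate the sample size needed for a survey proportion.

    population:       size of the whole population
    confidence_level: e.g. 0.95 for 95%
    margin_of_error:  e.g. 0.05 for +/- 5%
    p:                assumed proportion; 0.5 gives the largest (safest) sample
    """
    # Two-sided z-score for the confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Cochran's formula for a very large population
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    # Finite population correction shrinks the sample for small populations
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    print(sample_size(500, 0.95, 0.05))      # 218
    print(sample_size(500, 0.99, 0.05))      # a higher confidence level needs a larger sample
    print(sample_size(10 ** 9, 0.95, 0.05))  # 385
```

Note that confidence level and margin of error are passed in separately, which matches the point above that they are independent and do not need to add up to 100%.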
27:40 Reliable Data
Excellent video. Using this to help clean up a massive sales lead list.
I just love her. The best explanation!
Such a valuable tutorial. I believe I will become a master in data analysis.
When it comes to margin of error, isn't it the case that when you are doing two-tailed testing, as in the 4-day work week example, the error is split, rather than being a 50% or 70% margin? So if the margin of error (or alpha) = 0.1, then 0.05 would be for the left side of the confidence interval and 0.05 for the right side? This would make the margin actually 55% or 65%, right?
Great video and information.
Thank you for this awesome tutorial. Could you please make another dedicated, detailed video on sampling? If you already have such a tutorial, can you please share the link? Thanks in advance.
Thanks for this insight
Thank you. This was insightful.
Thank you so much ❤️
How do I clean my data?
I'm a recent Mechatronics engineering graduate. I'm interested in working at Google. I have applied for so many jobs at Google, but I didn't get even a single opportunity. I'm looking for an opportunity.
Lol, did you find one?
So much theory in here; it is making me nauseous.
I don't mean to be rude, but she looks like a cute Chinese A.I. robot