- Videos: 221
- Views: 2,284,448
Jordan has no life
USA
Joined Mar 25, 2022
Hey gigachads, welcome to the channel. I was doing a bunch of my own systems design studying using RUclips and other online resources, and it seems most of them kinda suck. I'm hoping to change that.
But don't just listen to me, listen to my viewers:
"Jordan is not just a great system designer and talented communicator with an amazing RUclips channel, he's also an infamously prolific lover currently in a polycule with Corinna Kopf and Apache Flink. We all have a lot to learn from him."
Apache Arrow - A Game Changer? | Distributed Systems Deep Dives With Ex-Google SWE
I used many sources, but I found these the most useful:
ruclips.net/video/R4BIXbfKBtk/видео.html&ab_channel=Dremio
ruclips.net/video/OLsXlKb_XRQ/видео.html&ab_channel=Databricks
arrow.apache.org/faq/
arrow.apache.org/docs/python/memory.html#on-disk-and-memory-mapped-files
ursalabs.org/blog/2020-feather-v2/
I wish the ladies were as capable of processing my data as two arrow native servers communicating with one another
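For anyone curious what the memory-mapping doc linked above is getting at, here's a minimal pyarrow sketch of a zero-copy read; the file name and table contents are invented for illustration.

```python
# Minimal sketch of Arrow's memory-mapped reads, per the pyarrow docs
# linked above. The file name and data are made up.
import pyarrow as pa
import pyarrow.ipc as ipc

table = pa.table({"views": [955, 2600, 5000]})

# Write an Arrow IPC file to disk.
with pa.OSFile("views.arrow", "wb") as sink:
    with ipc.new_file(sink, table.schema) as writer:
        writer.write_table(table)

# Memory-map it back: column buffers reference the mapped pages
# directly, so the read needs no copy into the process heap.
with pa.memory_map("views.arrow", "r") as source:
    loaded = ipc.open_file(source).read_all()

print(loaded.column("views"))
```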
Views: 955
Videos
Snowflake - Power With No Tuning | Distributed Systems Deep Dives With Ex-Google SWE
Views: 2K · 1 day ago
event.cwi.nl/lsde/papers/p215-dageville-snowflake.pdf justinjaffray.com/query-engines-push-vs.-pull/ Snowflakes for my paper, snowflake is my personality, snowflakes at the DJ show, they're everywhere really
Spark - Fault Tolerance Made Easy | Distributed Systems Deep Dives With Ex-Google SWE
Views: 2.6K · 14 days ago
people.csail.mit.edu/matei/papers/2012/nsdi_spark.pdf The last time I felt any sort of spark in my life was the one I felt eating Taco Bell at 2am on a Sunday
ZooKeeper - Better Than Chubby? | Distributed Systems Deep Dives With Ex-Google SWE
Views: 2.8K · 21 days ago
www.usenix.org/legacy/event/atc10/tech/full_papers/Hunt.pdf I have a slight preference towards chubby since I'm chubby myself
Kafka - Perfect For Logs? | Distributed Systems Deep Dives With Ex-Google SWE
Views: 5K · 28 days ago
notes.stephenholiday.com/Kafka.pdf There's no way Kafka is achieving better log throughput than my toilet though
Mesa - Data Warehousing Done Right | Distributed Systems Deep Dives With Ex-Google SWE
Views: 2.1K · 1 month ago
static.googleusercontent.com/media/research.google.com/en//pubs/archive/42851.pdf I'm also shooting out a bunch of data every 5 minutes or so; however, unlike Mesa, no one seems interested in it
Photon - Exactly Once Stream Processing | Distributed Systems Deep Dives With Ex-Google SWE
Views: 2.6K · 1 month ago
static.googleusercontent.com/media/research.google.com/en//pubs/archive/41318.pdf When I do stream processing I tend to also only do it on that particular event exactly once, as I'm afraid to two phase commit
Spanner - The Perfect Database? | Distributed Systems Deep Dives With Ex-Google SWE
Views: 2.9K · 1 month ago
storage.googleapis.com/gweb-research2023-media/pubtools/1974.pdf Spanner waits for the right time, but I won't; interpret that how you will
Systems Design in an Hour
Views: 26K · 1 month ago
Hi all, the slides are in the channel description (see google drive link). I will add timestamps shortly. Thank you all.
2:01 Performing A Systems Design Interview
4:02 Database Fundamentals
14:26 Data Serialization Frameworks
15:47 Replication
27:47 Sharding
37:47 Batch Processing
39:02 Stream Processing
44:12 Other Types Of Storage
53:29 Caching
55:39 Load Balancing
57:34 Systems Design Inter...
Megastore - Paxos ... But Better? | Distributed Systems Deep Dives With Ex-Google SWE
Views: 2K · 2 months ago
www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf My ex used to call me her mega store. I don't get it.
Percolator - Two Phase Commit In Practice | Distributed Systems Deep Dives With Ex-Google SWE
Views: 2.6K · 2 months ago
www.usenix.org/legacy/event/osdi10/tech/full_papers/Peng.pdf Gonna pop a percolator after this so I can relax
Dremel - Columns Are Better | Distributed Systems Deep Dives With Ex-Google SWE
Views: 2.1K · 2 months ago
Paper: static.googleusercontent.com/media/research.google.com/en//pubs/archive/36632.pdf The ladies are always curious about repetitions when it comes to my column. Obligatory: before you ask me about the protocol buffers stuff, read the paper lol. It may help disambiguate things.
Google SSO - Strong Consistency in Practice | Distributed Systems Deep Dives With Ex-Google SWE
Views: 3.7K · 2 months ago
This video is sponsored by Brilliant. To try everything Brilliant has to offer, free, for a full 30 days, visit brilliant.org/Jordanhasnolife/ . You'll also get 20% off an annual premium subscription. Paper link: www.usenix.org/legacy/event/worlds06/tech/prelim_papers/perl/perl.pdf The only thing single copy semantics, single sign on, and Jordan have in common is that we're all single.
BigTable - One Database to Rule Them All? | Distributed Systems Deep Dives With Ex-Google SWE
Views: 2.7K · 2 months ago
storage.googleapis.com/gweb-research2023-media/pubtools/4443.pdf I've only got one column in my column family but recently a lot of ladies have been requesting read access
Google File System (GFS) - It's Ok To Fail | Distributed Systems Deep Dives With Ex-Google SWE
Views: 5K · 3 months ago
static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf They used to call me the shadow master in my gym locker room
Chubby - Eventual Consistency Is Too Hard... | Distributed Systems Deep Dives With Ex-Google SWE
Views: 4.2K · 3 months ago
MapReduce - Google Thinks You're Bad At Coding | Distributed Systems Deep Dives With Ex-Google SWE
Views: 6K · 3 months ago
Dynamo - Why Amazon Ditched SQL | Distributed Systems Deep Dives With Ex-Google SWE
Views: 15K · 3 months ago
31: Distributed Priority Queue | Systems Design Interview Questions With Ex-Google SWE
Views: 8K · 4 months ago
30: LinkedIn Mutual Connection Search | Systems Design Interview Questions With Ex-Google SWE
Views: 5K · 4 months ago
29: Amazon Payment Gateway | Systems Design Interview Questions With Ex-Google SWE
Views: 13K · 4 months ago
28: Bidding Platform (eBay) | Systems Design Interview Questions With Ex-Google SWE
Views: 8K · 5 months ago
27: High Throughput Stock Exchange | Systems Design Interview Questions With Ex-Google SWE
Views: 11K · 5 months ago
26: Robinhood Stock Trading Platform | Systems Design Interview Questions With Ex-Google SWE
Views: 8K · 5 months ago
25: Live Streaming (Twitch) | Systems Design Interview Questions With Ex-Google SWE
Views: 7K · 5 months ago
24: Video/Conference Calling (Zoom) | Systems Design Interview Questions With Ex-Google SWE
Views: 7K · 5 months ago
23: Multiplayer Battle Royale Video Game | Systems Design Interview Questions With Ex-Google SWE
Views: 5K · 6 months ago
Insights From an L7 Meta Manager: Interviews, Onboarding, and Building Trust
Views: 52K · 6 months ago
22: Recommendation Engine (YouTube, TikTok) | Systems Design Interview Questions With Ex-Google SWE
Views: 11K · 6 months ago
I was recently thinking about Apache Fury vs protobuf
@9:30 you said if we want to insert at x and x is occupied we do x+1. What if x+1 is out of range? Will we move to the next partition?
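If it helps, a common way to handle this (not necessarily what the video does) is to wrap the probe around within the same table rather than spill into another partition; a toy sketch:

```python
# Toy linear-probing insert with wraparound: if slot x is occupied we
# try x+1, x+2, ... modulo the table size, so probing wraps back to
# slot 0 instead of crossing into a neighboring partition.
def probe_insert(table, key, value):
    n = len(table)
    for i in range(n):
        slot = (hash(key) + i) % n  # stay within this table
        if table[slot] is None or table[slot][0] == key:
            table[slot] = (key, value)
            return slot
    raise RuntimeError("table full: resize or reject")

table = [None] * 8
print(probe_insert(table, "job-42", "payload"))
```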
For hot assets like AAPL and GOOG, do they have dedicated pricing servers? For example, one layer-1 AAPL pricing server connects directly to the publisher, multiple layer-2 AAPL pricing servers connect to the layer-1 AAPL pricing server, and user servers connect to any one of the layer-2 pricing servers.
Right wrist and no tissues... Due to Apache Arrow?
Thanks for suffering through your sore throat to deliver this great lesson to us
How easy is it to register a Spark consumer to ZooKeeper? Spark consumers are configured and managed by the Spark driver program. Does the Spark controller provide this info via any API? And even if it does, since the number of executors is dynamic, how will Flink manage the merge sort of the lists?
I think you can't control which HDFS node the top-K list is written to. The distribution is handled by HDFS, and all that is visible to you is a directory structure. The same directory may be distributed across different data nodes.
Hi Jordan, thanks for the amazing video! I had a few questions about the posts DB. 1) As we are using Cassandra here, which is a leaderless replication DB: let's say a user uploads a post and immediately wants to update it. Because we are using leaderless replication, it's possible that the user may or may not be able to read their own writes. Does it make sense to have some kind of write-back cache that provides read-your-writes consistency, so that a user can update a very recently uploaded post, and then flush these writes/updates to Cassandra (assuming most users only update a new post within 5-10 minutes of the upload)? 2) Following up on the first question: if we have a write-back cache (assuming a distributed cache) and the cache goes down, posts would get dropped because they were never committed to the DB. In this case, would a write-ahead log for the cache help make this more fault tolerant?
I know it’s 1 year old and probably you know this already, but DBs like Postgres use MVCC, so you don’t need to lock rows that you update in your transaction.
Correct me if I'm wrong, but we don't need serializable transactions for booking one seat itself (I'm not talking about claiming etc.) if we have a lock. Let's suppose you acquire a distributed lock when you try to create/update the booking. Even if you decide to have a separate payment entity, you most probably don't need to update the balance; just creating it with a reference to the booking id is enough. You're going to use a 3rd-party vendor to send the data, so if the data is successfully transferred you just commit the transaction. Atomicity is provided by default. No 2 users can hold the lock at the same time (guaranteed by the consensus algorithm), and you just release the lock when the transaction is committed.
I was wrong in my previous message. Technically, 2 booking requests can occur at the same time: the TTL expires while you are already booking, and another user acquires the lock (reserved) and then immediately starts booking. This can be fixed using either serializable transactions OR a better way to construct the booking id, namely ticket_id + concert_time + seat_number. That way, your entire insert transaction fails if the constraint is violated. For updates, you could just use optimistic locking. Am I right?
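The composite-key idea in the comment above can be sketched with stdlib sqlite3 (the schema and values are hypothetical): the second insert for the same seat fails on the uniqueness constraint instead of double-booking.

```python
# Hypothetical schema illustrating the commenter's constraint idea:
# UNIQUE(ticket_id, concert_time, seat_number) makes a concurrent
# duplicate booking fail atomically at the database.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE bookings ("
    " ticket_id TEXT, concert_time TEXT, seat_number INTEGER,"
    " user_id TEXT,"
    " UNIQUE (ticket_id, concert_time, seat_number))"
)

def book(user_id):
    try:
        db.execute(
            "INSERT INTO bookings VALUES (?, ?, ?, ?)",
            ("t1", "2024-01-01T20:00", 14, user_id),
        )
        db.commit()
        return True
    except sqlite3.IntegrityError:  # seat already taken
        return False

print(book("alice"), book("bob"))  # True False
```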
At first I thought you rated Meghan _Trainor_ a 10 and I almost threw up
Pleading the 5th
Arrow Flight is a bit like Cap'n Proto or FlatBuffers
Hey Jordan! I love your simplified explanation of Advanced concepts. Kudos to your oratory skills!!!! It would be great if you can make a video on Elastic block storage(AWS) or Persistent Disk(GCP). Much Appreciated!!!!!
Dude I loooove arrow and what it has to offer for future of open source data engineering, so glad you also did a video on it!
lets goo
Can we partition on a single node, like creating partitions in MySQL? What's the benefit? Just going directly to the hard drive section where the data is? I assume the DB files are already size-limited, and with the B-tree we already know where they are.
0:40 weird correlation indeed 😂
Hey Jordan! This is super helpful. Thanks so much! Quick question about ensuring jobs only run once: the retry logic via the scheduling node a la timestamps is intentional, right? As in, we intend to run the same job more than once, so I was a bit confused by that concern at around 23:05. Even if an executor goes down, we want to retry that job, right? I guess I was a bit confused why "running jobs once" is even a concern, as that is expected behavior.
Didn't know Kylo Ren did tech stuff
video so good, liked in the first 1 minute
It tends to normally only take me a minute anyways
Hey Jordan this is awesome content and you're a great teacher. Listening to it in the gym from India and I can follow everything.
hope you're getting large!
Hey Jordan, thanks for the amazing system design video as usual!! I have one doubt on the usage of Flink. Whenever at least one Flink compute node goes down or restarts, the Flink job fails and it has to restore the entire state across all the nodes from S3. This whole restoration process can take a few minutes, so our message delivery will be delayed by that many minutes, affecting the entire user base. Is that understanding correct?
I don't think this would require restoring all nodes from S3. I would just expect no messages to be processed from that partition during that point in time.
@@jordanhasnolife5163 Yes, it would be great that way, but Flink relies on a distributed consistent snapshot across nodes. Checkpointing and restoration operate at the job level; they cannot be split at the node level.
I had a hard time reading Alex Petrov's Database Internals book, so I quit after a few days. But this is informative and engaging at the same time. Looking forward to the entire series.
Chapters please 🙏
Look at video description please :pray:
One question regarding the fan-out approach. While pushing posts to each follower, we push them to a particular news feed cache corresponding to that user. My doubt is whether these news feed caches are just another caching layer sharded on user id (let's say 10 caching servers sharded on user id for 100 users), or are they specific to the user (100 users, 100 caches in that case)?
Hi Jordan, just wondering if the mutual connection databases at 14:14 are the same as the mutual cache table?
yep!
Jordan sat alone in his dimly lit room, eyes fixed on the screen. Kafka Streams flowed before him like a slow, seductive dance. His fingers moved over the keyboard, sending commands that made the data bend to his will: smooth, precise, and totally in his control. “Processing this much data feels like... processing my love life,” he chuckled, leaning in closer. “A little messy at first, but once I get my hands on it, everything falls into place... perfectly.” The logs rolled in real time, a rhythm that matched his heartbeat. Who needed real life when the streams responded to him this way? Controlled. Obedient. Alive. Jordan didn’t just process data; he made it swoon.
Real or ChatGPT? Either way, well done, I do think about kafka streams a lot
cool
Hey Jordan, do you have any course on Udemy? If not, I highly recommend you do that
I do not! Everything that I post I want to be free! If I'm going to sell something, hopefully it can provide some real utility to you guys :)
great
Thank you Jordan for this series. In the paper, when comparing the 3 partitioning approaches, it's written that for the fixed-size 3rd strategy, the membership information stored at each node is reduced by three orders of magnitude. Whereas in the previous paragraphs, it's mentioned that the 3rd strategy stores not only the token (server) hashes, as in the 1st, but also which partitions are stored on each node. Isn't that contradictory, or am I misunderstanding something? Ideally the third partitioning scheme should contain more membership information per node, assuming they haven't changed request forwarding from O(1) hops to log(n) DHT-style routing like Chord or Pastry, where each node stores only a limited number of nodes' information, sacrificing direct hops.
I'm actually not quite sure, but I imagine that for fixed tokens perhaps you can just give each token a name and say which token belongs to which node, rather than explicitly listing out the ranges, which saves a bit of information to propagate (maybe if there are 128 tokens, for example, you only need a short to communicate a token name)
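As a back-of-envelope on that guess (all numbers assumed for illustration, not the paper's math): with Q globally known fixed partitions, membership gossip only needs a token-id-to-node map rather than raw ring positions.

```python
# Rough arithmetic for the token-naming idea above; Q and the byte
# sizes are assumptions, not figures from the Dynamo paper.
Q = 128  # fixed partitions ("tokens")

# Random-token style: advertise each token as a raw 128-bit ring position.
random_token_bytes = Q * 16

# Fixed-partition style: partitions are globally known, so membership
# is just token-id -> node, and a token id fits in a 2-byte short.
fixed_token_bytes = Q * 2

print(random_token_bytes, fixed_token_bytes)  # 2048 vs 256
```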
Hey Jordan, I just wanted to take a moment to express my deep gratitude. I recently received 5 offers from tier-1 companies, including an offer from Google for a Staff Engineer role. Your videos have been an absolute game-changer for me throughout this journey. I can't thank you enough for the insights and guidance you've shared; it's made a world of difference. Please keep up the amazing work, you're truly making an impact!
Hey Ankit! That's unbelievable! Congratulations and keep killing it!! I'm really glad all your hard work paid off :)
Some other videos talked about recommendation engines from the ML aspect: content filtering, collaborative filtering. What is the relationship between those and this embedding approach?
I don't know off the top of my head, but I think you'll always need embeddings at this point if you're doing ML
Besides the vector DB, which databases should we choose for the other data, including the entity history DB and the neighbor index? Can we use Cassandra for the history DB, because it is write-heavy and append-only, and a KV store for the neighbor index?
what do you mean by "entity history"?
@@jordanhasnolife5163 it is in the slide "Step 1a)" at 13'
11:00 Good summary. You can use a hash index when you want fast reads and writes, but you can't do efficient range queries. The hash index is also kept in memory, not on disk.
Thank You!
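A toy illustration of the trade-off summarized above (not any real engine's code): point lookups are O(1), but a range query has to scan every key, because hashing destroys ordering.

```python
# In-memory hash index: great point reads/writes, no efficient ranges.
index = {}

def put(key, value):
    index[key] = value     # O(1) amortized

def get(key):
    return index.get(key)  # O(1) expected

def range_query(lo, hi):
    # Hashing scatters keys, so a range scan must touch every entry.
    return sorted(k for k in index if lo <= k <= hi)

put("a", 1); put("m", 2); put("z", 3)
print(get("m"), range_query("a", "n"))  # 2 ['a', 'm']
```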
Thank you! The videos and the discussions in Comments make this channel the best source for system design.
14:02 Isn't it the same UUID and a different timestamp for replacing the message?
it's the sender timestamp, not the flink timestamp
Another comment on this awesome video after rewatching it a bunch of times. So to conclude, how did you say we achieve idempotency, since there seems to be no best option?
I'd probably just do it on the server and if we send a duplicate notification such is life
Thank you!
Assigning N seats in a section to a group of k people, each requesting {s_1, ..., s_k} seats, is a bin-packing problem. Simply allocating seats in FIFO order may lead to unfairness and sub-optimal allocation (people at the end of the list requesting a higher number of seats will be most impacted, and we want to sell as many tickets as possible). Although it's not practically possible to solve bin-packing in real time, a simple optimization would be to maintain the sum of total seats requested by people for every section. In real time, while bookings are ongoing, that sum must not exceed N. The actual seat allocation may be done offline; e.g., seats can be finalized after booking closes, and users can be sent their tickets with actual seat numbers via mail/phone [AWS SNS].
That's totally fine by me if it's allowed by your interviewer to compute exact seats after the fact!
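A minimal sketch of the running-sum guard proposed two comments up (the class and method names are invented); a real system would perform the check-and-increment atomically, e.g. via a conditional update.

```python
# Admit a group only if the running sum of requested seats stays
# within the section's capacity; exact seat assignment happens offline.
class Section:
    def __init__(self, capacity):
        self.capacity = capacity
        self.requested = 0  # running sum of admitted seat requests

    def try_admit(self, group_size):
        if self.requested + group_size > self.capacity:
            return False  # would oversell the section
        self.requested += group_size
        return True

s = Section(100)
print(s.try_admit(60), s.try_admit(30), s.try_admit(20))  # True True False
```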
Jordan, would HBase be a better alternative to MySQL here? You get write-ahead-log-style indexing, which can then be leveraged to build our heap as well and expire elements as they are consumed.
I think you'd have to hack at HBase a bit, especially considering our heap is probably bigger than HBase's memtable. Could maybe be an optimization though!
Thank you!
If we take the example of building search on top of chats and partition it by chat_id, won't that lead to an uneven distribution of data? Elasticsearch has these shards and tries to distribute data evenly across all of them, but if we explicitly route our data to a specific shard (using chat_id in our example), it can lead to an uneven distribution across shards, where one chat might be active and another might be dormant. Just thinking out loud about how we would solve for this :P (Probably distribute it evenly using some composite key, but that would defeat the purpose of searching chats from just one partition.)
Using many small partitions and balancing them appropriately I believe tends to be the preferred approach here
Why do you hash on the user id instead of just storing the session state in separate storage like Redis? Sticky sessions will not guarantee an even distribution of traffic, and they might be a bottleneck for a very high amount of traffic.
Yeah at the end of the day this is fine too, if you have to deliver a user notification to two servers it's not a big deal
All those channels are more redundant than a replicated database!! True that 😂
Thank you for the series, finally finished it! You're a literal chad, I can only hope to grow up to be like you one day.
You already are brother
Hi Jordan, in the order service, how does the second Flink stage, which is sharded by product ID, know that it has received all the products in the cart before sending the email to the user?
I hadn't really made that my intention, and in the diagram we're willing to send multiple emails. If we wanted to do this, we'd probably have to split the products like we do here, include the order id and the total number of products in each message, and then send to one final Kafka queue + consumer that aggregates on order id.
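A rough sketch of the aggregation Jordan describes (all names invented; the real version would live in a Kafka consumer or Flink job): each per-product message carries the order id and the order's total product count, and the consumer emails once when the count is reached.

```python
# Count per-product messages by order id; fire one email per order
# once every product in that order has been processed.
from collections import defaultdict

seen = defaultdict(int)  # order_id -> products processed so far

def send_confirmation_email(order_id):
    print(f"order {order_id} complete, emailing user")

def on_message(order_id, total_products):
    seen[order_id] += 1
    if seen[order_id] == total_products:
        send_confirmation_email(order_id)
        del seen[order_id]  # done with this order

for _ in range(3):
    on_message("order-42", 3)  # emails on the third product
```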
Great video, thanks for the incredible contribution. 2 questions: 1) How do you notify users who aren't online about unpopular posts? I would assume you need some sort of user-specific queue after the fanout. The notifications table is sharded by topic id, so it possibly won't be queried directly for unpopular posts. 2) How do you notify users who are online about popular posts, since I don't see it connected to the websocket flow? I assume you expect them to poll periodically to see if they have any popular posts?
1) when they come back online, they'll basically hit a cache of all notifications meant for them specifically (cache is partitioned by hash of user id), and combine this with popular notifications they were subscribed to 2) yep, just polling!