High Performance Data Streaming with Amazon Kinesis: Best Practices and Common Pitfalls

  • Published: 19 Oct 2024

Comments • 23

  • @gsminfy · 4 years ago · +35

    Index
    Kinesis Data Streams intro = 7:00
    Lambda Poison messages = 10:18
    Kinesis Enhanced Fan out = 15:09
    Kinesis Firehose ETL = 19:25
    Firehose with Glue catalog for schema/format conversion = 22:55
    Glue catalog demo = 27:55
    Using custom prefixes demo = 32:05

  • @Art-kz6zf · 2 years ago

    Best introductory presentation of Kinesis that I could find. Thank you!

  • @sbvuyyuru · 2 years ago

    It's one of the best tech talks I have seen so far; it's very clear and an amazing presentation.

  • @anuyajoshi4360 · 4 years ago · +2

    Excellent video; the entire session had multiple aspects to learn from.
    The technical overview was no doubt good, but the points below added benefits:
    a) conceptual thinking
    b) how to reduce redundancy with EFO-Pipe (enhanced fan-out)
    c) composed delivery of the concept from start to end.

    • @nowtify3227 · 4 years ago

      Do you have any idea about a camera alert system that sends SMS using Kinesis?

  • @Rk23able · 2 years ago

    Thank you, perfectly explained; I learned something new and reinforced existing concepts.

  • @najah555 · 3 years ago · +2

    Great session, Randy!!! As always!

  • @GeorgeZoto · 1 year ago

    Thank you for this informative session even if it was a while ago :)

  • @dharam35 · 3 years ago · +1

    Very neat! Thank you for the great knowledge sharing!

  • @progermv · 3 years ago

    Great session!

  • @АлексейРоговский-ж8у · 4 years ago · +1

    What should I use if I need to fetch all the data in a stream, convert/transform it, and send the result via an API? Does KDS support converting/transforming in real time, or do I have to use Firehose?

    • @gsminfy · 4 years ago · +1

      If you are talking about format conversions or data transformations/manipulations/cleansing, I would read the streaming data through a Lambda function, do the transformation, and push it to DynamoDB, an S3 bucket, or other downstream integrations. If you can live with near-real-time, I would go with Firehose.
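
      A minimal sketch of that Lambda approach in Python (boto3), assuming JSON payloads in the stream; the table name "transformed-events" and the payload fields are hypothetical:

```python
# Lambda consumer for a Kinesis event source mapping: transform each record
# and write it to DynamoDB. Table name and payload fields are hypothetical.
import base64
import json

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("transformed-events")

def handler(event, context):
    for record in event["Records"]:
        # Kinesis delivers record data base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        # Example transformation: project out just the fields we need.
        table.put_item(Item={
            "id": record["kinesis"]["sequenceNumber"],
            "device": payload.get("device", "unknown"),
            "value": payload.get("value"),
        })
```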

  • @jaysonDfernandez · 3 years ago

    Hi sir, what's the difference between IoT Core and Kinesis? Thank you.

  • @sakthivel6189 · 4 years ago

    Good session! Thanks Randy...

  • @YEO19901 · 3 years ago

    Super!!

  • @sanooosai · 3 years ago

    GOOD BETA

  • @m_nouman_shahzad · 4 years ago

    10:20 Poison Messages

  • @sukulmahadik0303 · 3 years ago · +8

    *Notes Part 2*
    *Benefits of Kinesis (managed service) for streaming:*
    1) Kinesis is a managed service, so there is no infrastructure provisioning and no management.
    2) Automatically scales through re-shard operations.
    3) No stream consumption costs when there are no new records to process.
    4) Highly available and secure (supports encryption).
    --------------------------------------------------
    *Producers/Stream Ingestion:*
    On the producer side there are a number of tools and utilities for ingesting data into Kinesis.
    Producers can be broken down into 3 categories:
    1) AWS toolkits/libraries - AWS SDK, AWS Mobile SDK, Kinesis Producer Library (KPL), Kinesis Agent (we install this on our systems to aggregate data at the source and then persist it out in batches).
    2) AWS service integrations - AWS IoT, Amazon CloudWatch Events and Logs, AWS DMS.
    3) 3rd-party offerings - tools like Log4j, Flume, or Fluentd have integrations with Kinesis. These normally help us pull logs from applications and push them into Kinesis.
    A producer is an application that writes data to Amazon Kinesis Data Streams. There are several methods for writing data to your data stream (a sketch of the API route follows this list):
    • You can develop producers using the Kinesis Agent, a standalone Java application that offers an easy way to collect and send data to Kinesis Data Streams.
    • You can develop producers using the Kinesis Data Streams API with the AWS SDK for Java. Two operations in the API add data to a stream: PutRecords and PutRecord.
    • You can develop producers using the Kinesis Producer Library (KPL). The KPL is an easy-to-use, highly configurable library that helps you write to a Kinesis data stream. It acts as an intermediary between your producer application code and the Kinesis Data Streams API actions. You can monitor the KPL with Amazon CloudWatch.
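
    A minimal producer sketch of the PutRecords path, using the AWS SDK for Python (boto3) rather than Java; the stream name "demo-stream" and the event shape are assumptions for illustration:

```python
# Batch producer using the Kinesis Data Streams PutRecords API via boto3.
# The stream name "demo-stream" and the event fields are hypothetical.
import json

import boto3

kinesis = boto3.client("kinesis")

def put_batch(events):
    """Write a batch of events with PutRecords (up to 500 records per call)."""
    response = kinesis.put_records(
        StreamName="demo-stream",
        Records=[
            {
                "Data": json.dumps(e).encode("utf-8"),
                # The partition key decides which shard each record lands on.
                "PartitionKey": str(e["device"]),
            }
            for e in events
        ],
    )
    # PutRecords is not all-or-nothing: failed records must be retried.
    if response["FailedRecordCount"]:
        print(f"{response['FailedRecordCount']} records failed; retry them")

put_batch([{"device": "cam-1", "value": 42}, {"device": "cam-2", "value": 7}])
```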
    --------------------------------------------------
    *Consumers/Stream Processing:*
    Consumers can also be broken into 3 categories:
    1) Kinesis - Kinesis Data Analytics (SQL and Flink), and the Kinesis Client Library + Connector Library, which let us build our own applications and scale them out using EC2 instances + Auto Scaling.
    2) AWS services - AWS Lambda to build completely serverless applications, Amazon EMR.
    3) 3rd-party offerings - Spark, Databricks, Splunk, Apache Storm, etc.
    A consumer is an application that reads and processes all data from a Kinesis data stream. There are several methods for reading data from your data stream (a sketch follows this list):
    - You can use an AWS Lambda function to process records in your data stream.
    - You can use an Amazon Kinesis Data Analytics application to process and analyze data in your data stream using SQL or Java.
    - You can use Kinesis Data Firehose to read and process records from your data stream.
    - You can develop a consumer to read and process records from your data stream using the Kinesis Client Library (KCL). The KCL handles many of the complex tasks associated with distributed computing, such as load balancing across multiple instances, responding to instance failures, checkpointing processed records, and reacting to resharding. The KCL lets you focus on writing the record-processing logic for your data stream.
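
    To show what the KCL automates, here is a deliberately bare-bones single-shard consumer using the low-level API via boto3 (no checkpointing, load balancing, or failover); the stream name "demo-stream" is an assumption:

```python
# Bare-bones consumer reading one shard with the low-level Kinesis API.
# Everything the KCL normally handles (checkpoints, failover, resharding)
# is deliberately missing. Stream name "demo-stream" is hypothetical.
import time

import boto3

kinesis = boto3.client("kinesis")

# Take the first shard and start from its oldest available record.
shard_id = kinesis.list_shards(StreamName="demo-stream")["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName="demo-stream",
    ShardId=shard_id,
    ShardIteratorType="TRIM_HORIZON",
)["ShardIterator"]

while iterator:
    result = kinesis.get_records(ShardIterator=iterator, Limit=100)
    for record in result["Records"]:
        print(record["SequenceNumber"], record["Data"])
    iterator = result["NextShardIterator"]
    time.sleep(1)  # respect the 5 GetRecords calls/second/shard limit
```

    A real application would let the KCL (or a Lambda event source mapping) manage shards and checkpoints; this loop only illustrates the underlying GetShardIterator/GetRecords flow.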