Big Data Training & Consulting

Get training & advice from experts


Kafka Training

Our Kafka course teaches administrators how to secure and administer a Kafka installation, and how to engineer and maintain solutions for your streaming data needs. Application developers will learn how to develop robust applications that leverage the Kafka API and follow best practices for application development. Learn how to build flexible data pipelines that scale to handle your Big Data processing requirements.

Kafka Course Outline

Kafka Introduction

  • Architecture
  • Overview of key concepts
  • Overview of ZooKeeper
  • Cluster, Nodes, Kafka Brokers
  • Consumers, Producers, Logs, Partitions, Records, Keys
  • Partitions for write throughput
  • Partitions for Consumer parallelism (multi-threaded consumers)
  • Replicas, Followers, Leaders
  • How to scale writes
  • Disaster recovery
  • Performance profile of Kafka
  • Consumer Groups, the “High Water Mark”, and what consumers see
  • Consumer load balancing and fail-over
  • Working with Partitions for parallel processing and resiliency
  • Brief Overview of Kafka Streams, Kafka Connectors, Kafka REST
  • Create a topic
  • Produce and consume messages from the command line
  • Configure and set up three servers
  • Create a topic with replication and partitions (see the sketch after this list)
  • Produce and consume messages from the command line

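To give a flavour of the hands-on work, the sketch below creates a replicated, partitioned topic programmatically with Kafka's Java AdminClient. The topic name, the partition and replica counts, and the localhost:9092 broker address are assumptions for illustration; in class the same is done with the command-line tools.

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateTopicExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Assumed local broker address for illustration.
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // 13 partitions for write throughput, replication factor 3 for resiliency.
                NewTopic topic = new NewTopic("my-topic", 13, (short) 3);
                admin.createTopics(Collections.singleton(topic)).all().get();
            }
        }
    }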

Kafka Producer & Consumer Basics

  • Introduction to Producer Java API and basic configuration
  • Create topic from command line
  • View a topic's partition layout and topology from the command line
  • View log details
  • Use ./kafka-replica-verification.sh to verify replication is correct
  • Introduction to Consumer Java API and basic configuration (see the producer/consumer sketch after this list)
  • View how far behind the consumer is from the command line
  • Force failover and verify that new leaders are chosen
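
For orientation, here is a minimal sketch of the producer and consumer Java APIs introduced in this module. The topic name, consumer group id, and broker address are assumptions, and configuration is kept to the bare minimum.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class BasicProducerConsumer {
        public static void main(String[] args) {
            // Producer: send a single key/value record to an assumed topic.
            Properties p = new Properties();
            p.put("bootstrap.servers", "localhost:9092");
            p.put("key.serializer", StringSerializer.class.getName());
            p.put("value.serializer", StringSerializer.class.getName());
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
                producer.send(new ProducerRecord<>("my-topic", "key-1", "hello kafka"));
            }

            // Consumer: subscribe and poll (in practice this runs in a loop).
            Properties c = new Properties();
            c.put("bootstrap.servers", "localhost:9092");
            c.put("group.id", "example-group");
            c.put("key.deserializer", StringDeserializer.class.getName());
            c.put("value.deserializer", StringDeserializer.class.getName());
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
                consumer.subscribe(Collections.singleton("my-topic"));
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
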
Kafka Architecture

  • Motivation: a focus on high throughput
  • Embrace file system / OS caches and how this impacts OS setup and usage
  • File structure on disk and how data is written
  • Kafka Producer load balancing details
  • Producer record batching by size and time (config sketch after this list)
  • Producer async commit and commit (flush, close)
  • Pull vs. push and backpressure
  • Compression via message batches (unified compression to server, disk, and consumer)
  • Consumer poll batching, long poll
  • Consumer Trade-offs of requesting larger batches
  • Consumer liveness and failover redux
  • Managing consumer position (auto-commit, async commit and sync commit)
  • Message delivery semantics: at most once, at least once, exactly once
  • Performance trade-offs of message delivery semantics
  • Performance trade-offs of poll size
  • Replication, Quorums, ISRs, committed records
  • Failover and leadership election
  • Log compaction by key
  • Failure scenarios
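
The batching, compression, and acknowledgement behaviour described above is driven by producer configuration. The sketch below shows the relevant settings; the broker address and the specific values are illustrative assumptions, not tuning recommendations.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class ProducerTuningConfig {
        public static Properties tuningProps() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            // Batch by size: send when a partition batch reaches 64 KB...
            props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);
            // ...or by time: wait up to 20 ms for a batch to fill before sending.
            props.put(ProducerConfig.LINGER_MS_CONFIG, 20);
            // Compress whole batches; the same compressed form is stored on disk
            // and shipped to consumers.
            props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
            // acks=all: the leader waits for the in-sync replicas before acknowledging.
            props.put(ProducerConfig.ACKS_CONFIG, "all");
            return props;
        }
    }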

Advanced Kafka Producers

  • Using batching (time/size)
  • Using compression
  • Async producers and sync producers
  • Commit and async commit
  • Default partitioning (round-robin when there is no key, hash of the key otherwise)
  • Controlling which partition records are written to (custom partitioning; see the sketch after this list)
  • Message routing to a particular partition (use cases for this)
  • Advanced Producer configuration
  • Use message batching and compression
  • Use round-robin partition
  • Use a custom message routing scheme
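
To make custom message routing concrete, here is a sketch of a Partitioner implementation. The routing rule (keys starting with "priority-" always land on partition 0) is an invented example, and the class is plugged in through the producer's partitioner.class setting (ProducerConfig.PARTITIONER_CLASS_CONFIG).

    import java.util.Map;
    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;
    import org.apache.kafka.common.utils.Utils;

    // Hypothetical routing rule: keys starting with "priority-" always go to
    // partition 0; everything else is spread by a hash of the key.
    public class PriorityPartitioner implements Partitioner {
        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            int numPartitions = cluster.partitionsForTopic(topic).size();
            if (key != null && key.toString().startsWith("priority-")) {
                return 0;
            }
            if (keyBytes == null) {
                // This sketch assumes keyed records; route key-less records to partition 0 too.
                return 0;
            }
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }

        @Override
        public void close() {}

        @Override
        public void configure(Map<String, ?> configs) {}
    }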

Advanced Kafka Consumers

  • Adjusting poll read size
  • Implementing at-most-once message semantics using the Java API
  • Implementing at-least-once message semantics using the Java API (see the sketch after this list)
  • Implementing as close as we can get to exactly-once semantics with the Java API
  • Re-consume messages that are already consumed
  • Using ConsumerRebalanceListener to start consuming from a certain offset (consumer.seek*)
  • Assigning a consumer a specific partition (use cases for this)
  • Hands-on: write an advanced Java consumer that adjusts poll read size and implements at-most-once, at-least-once, and near-exactly-once semantics
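
The difference between these delivery semantics largely comes down to when the consumer commits its position relative to processing. Below is a minimal sketch of the at-most-once and at-least-once orderings, assuming enable.auto.commit is set to false and that processRecord stands in for your own application logic.

    import java.time.Duration;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class DeliverySemanticsSketch {

        // At most once: commit BEFORE processing. A crash mid-processing loses
        // those records, but they are never processed twice.
        static void atMostOnce(KafkaConsumer<String, String> consumer) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            consumer.commitSync();                      // position saved first
            for (ConsumerRecord<String, String> r : records) {
                processRecord(r);                       // runs zero or one time per record
            }
        }

        // At least once: commit AFTER processing. A crash before the commit means
        // the batch is reprocessed on restart, so processing should be idempotent.
        static void atLeastOnce(KafkaConsumer<String, String> consumer) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> r : records) {
                processRecord(r);                       // runs one or more times per record
            }
            consumer.commitSync();                      // position saved only after success
        }

        // Placeholder for application logic (assumption for this sketch).
        static void processRecord(ConsumerRecord<String, String> record) {
            System.out.println(record.value());
        }
    }
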
Schema Management in Kafka

  • Avro overview
  • Avro Schemas
  • Flexible Schemas with JSON and defensive programming
  • Using Kafka’s Schema Registry (config sketch after this list)
  • Topic schema management
  • Schema validation
  • Preventing producers from sending messages that don’t align with the topic schema / Schema Registry
  • Preventing consumers from accepting unexpected schemas / defensive programming
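
As a sketch of how a producer is pointed at a Schema Registry, the configuration below uses Confluent's Avro serializer; the serializer class name and the schema.registry.url property come from Confluent's client libraries (assumed to be on the classpath), and the addresses are placeholders.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class AvroProducerConfig {
        public static Properties avroProps() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            // The Avro serializer registers/validates schemas against the registry
            // before records are sent (assumes the Confluent serializer jar).
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                    "io.confluent.kafka.serializers.KafkaAvroSerializer");
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                    "io.confluent.kafka.serializers.KafkaAvroSerializer");
            // Assumed local Schema Registry endpoint.
            props.put("schema.registry.url", "http://localhost:8081");
            return props;
        }
    }
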
Kafka Security

  • SSL for encrypting transport and for authentication (client config sketch after this list)
  • Setting up keys
  • Using SSL for authentication instead of username/password
  • Setup keystore for transport encryption
  • Setup truststore for authentication
  • Producer to server encryption
  • Consumer to server encryption
  • Kafka broker to Kafka broker encryption
  • SASL for Authentication
  • Overview of SASL
  • Integrating SASL with Active Directory
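
To make the SSL items concrete, here is a sketch of the client-side settings; the file paths, passwords, and broker address are placeholders, and the brokers need matching listener, keystore, and truststore configuration.

    import java.util.Properties;
    import org.apache.kafka.clients.CommonClientConfigs;
    import org.apache.kafka.common.config.SslConfigs;

    public class SslClientConfig {
        public static Properties sslProps() {
            Properties props = new Properties();
            props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, "broker1:9093");
            // Encrypt producer/consumer <-> broker traffic over TLS.
            props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
            // Truststore: the certificates the client trusts (the brokers' CA).
            props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/var/ssl/client.truststore.jks");
            props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, "changeit");
            // Keystore: the client's own certificate, used when brokers require
            // SSL client authentication instead of username/password.
            props.put(SslConfigs.SSL_KEYSTORE_LOCATION_CONFIG, "/var/ssl/client.keystore.jks");
            props.put(SslConfigs.SSL_KEYSTORE_PASSWORD_CONFIG, "changeit");
            props.put(SslConfigs.SSL_KEY_PASSWORD_CONFIG, "changeit");
            return props;
        }
    }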

Kafka Admin

  • OS config and hardware selection
  • Monitoring Kafka KPIs
  • Monitoring consumer lag (consumer group inspection; see the sketch after this list)
  • Log Retention and Compaction
  • Fault tolerant Cluster
  • Growing your cluster
  • Reassign partitions
  • Broker configuration details
  • Topic configuration details
  • Producer configuration details
  • Consumer configuration details
  • ZooKeeper configuration details
  • Tools for managing ZooKeeper
  • Accessing JMX from command line
  • Using dump-log segment from command line
  • Replaying a log (replay log producer)
  • Re-consume messages that are already consumed
  • Setting Consumer Group Offset from command line
  • Kafka Migration Tool: migrate a broker from one version to another
  • MirrorMaker: mirroring one Kafka cluster to another (one DC/region to another DC/region)
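
One way to inspect consumer lag programmatically is to compare a group's committed offsets with the current end offsets, as in the sketch below; the group id and broker address are assumptions, and the kafka-consumer-groups command-line tool covered in class reports the same information.

    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class ConsumerLagCheck {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                // Offsets the group has committed so far.
                Map<TopicPartition, OffsetAndMetadata> committed =
                        admin.listConsumerGroupOffsets("example-group")
                             .partitionsToOffsetAndMetadata().get();

                // End offsets (the "latest" positions) for the same partitions.
                Properties cprops = new Properties();
                cprops.put("bootstrap.servers", "localhost:9092");
                cprops.put("key.deserializer", StringDeserializer.class.getName());
                cprops.put("value.deserializer", StringDeserializer.class.getName());
                try (KafkaConsumer<String, String> probe = new KafkaConsumer<>(cprops)) {
                    Map<TopicPartition, Long> ends = probe.endOffsets(committed.keySet());
                    committed.forEach((tp, offset) -> System.out.printf(
                            "%s lag=%d%n", tp, ends.get(tp) - offset.offset()));
                }
            }
        }
    }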

Kafka Streams

  • Architecture Overview
  • Concepts
  • Design Patterns
  • Examples (a minimal topology sketch follows)
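
As a preview of the examples, here is a minimal Kafka Streams topology that upper-cases values from one topic into another; the application id, topic names, and broker address are assumed for illustration.

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;

    public class UppercaseStreamApp {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-example");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            // Build the topology: read "input-topic", transform each value,
            // write the result to "output-topic".
            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> source = builder.stream("input-topic");
            source.mapValues(value -> value.toUpperCase()).to("output-topic");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
            // Close cleanly on shutdown.
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }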

Contact Us

Please contact us for any queries via phone or our contact form. We will be happy to answer your questions.

3 Appian Place, 373 Kent Ave
Ferndale, 2194
South Africa
Tel: +27 11 781 8014 (Johannesburg)
     +27 21 020 0111 (Cape Town)

Contact Form
