2018-10-22         Ruth

Kafka block.on.buffer.full default value

I was going over the Kafka (0.9.0.1) producer configuration, and the documentation for the property block.on.buffer.full says: "When our memory buffer is exhausted we must either stop accepting new records (block) or throw errors. By default this setting is true and we block, however in some scenarios blocking is not desirable and it is better to immediately give an error. Setting this to false will accomplish that: the producer will throw a BufferExhaustedException if a record is sent and the buffer space is full." So theoretically it should be true, but in that same d...
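
For context, a minimal sketch of how that property is typically set on a 0.9.x producer (broker address and serializers are placeholders). Note that around this release the client also introduced max.block.ms, which later replaced block.on.buffer.full entirely, so documentation pages for nearby versions do not all describe the same default.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import java.util.Properties;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");   // placeholder broker
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    // false: throw BufferExhaustedException when the buffer is full instead of blocking
    props.put("block.on.buffer.full", "false");
    KafkaProducer<String, String> producer = new KafkaProducer<>(props);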

 apache-kafka                     2 answers                     101 views
 2018-10-22         Kennedy

org.apache.kafka.common.errors.RecordTooLargeException

I'm doing some Kafka Streams aggregation, writing the aggregated records to a topic, and getting the following errors. I'm using a custom JSON serde for the aggregation helper class. The solution I found on some blogs is to increase max.request.size. But even though I increased max.request.size from the default to 401391899, the serialized aggregation message size keeps increasing on subsequent writes to the topic. After running the streams for about 10 minutes, the error below shows up. I'm not sure whether the problem is with my serde or whether I should change any config settings oth...
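
If the aggregate value itself grows without bound, raising the limit only postpones the error; still, for reference, this is roughly how the producer inside a Streams application is usually given a larger max.request.size (application id, brokers and the 2 MB value are illustrative; StreamsConfig.producerPrefix exists in newer Streams versions). The broker-side message.max.bytes and topic-level max.message.bytes limits are separate and may also need raising.

    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.streams.StreamsConfig;
    import java.util.Properties;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "aggregation-app");    // placeholder
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder
    // producerPrefix() routes the setting to the producer embedded in the Streams app
    props.put(StreamsConfig.producerPrefix(ProducerConfig.MAX_REQUEST_SIZE_CONFIG), "2097152"); // ~2 MB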

 apache-kafka                     1 answer                     2 views
 2018-10-22         Simon

How to unit test Kafka Streams

While exploring how to unit test a Kafka Stream I came across ProcessorTopologyTestDriver; unfortunately this class seems to have been broken since version 0.10.1.0 (KAFKA-4408). Is there a workaround available for the KTable issue? I saw the "Mocked Streams" project, but first it uses version 0.10.2.0 while I'm on 0.10.1.1, and second it is Scala while my tests are Java/Groovy. Any help on how to unit test a stream without having to bootstrap ZooKeeper/Kafka would be great. Note: I do have integration tests that use embedded servers; this is for unit tests, aka fast, sim...
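
For anyone on newer versions: since 1.1.0, Kafka ships a supported test harness (kafka-streams-test-utils with TopologyTestDriver) that runs a topology without ZooKeeper or brokers. A rough sketch against the 2.4+ style of that API (topic names and the trivial topology are placeholders), not something available on 0.10.1.x:

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.*;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "unit-test");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234");   // never contacted by the driver
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

    StreamsBuilder builder = new StreamsBuilder();
    builder.stream("input-topic").to("output-topic");                  // placeholder topology

    try (TopologyTestDriver driver = new TopologyTestDriver(builder.build(), props)) {
        TestInputTopic<String, String> in = driver.createInputTopic(
                "input-topic", Serdes.String().serializer(), Serdes.String().serializer());
        TestOutputTopic<String, String> out = driver.createOutputTopic(
                "output-topic", Serdes.String().deserializer(), Serdes.String().deserializer());
        in.pipeInput("key", "value");
        // assert on out.readKeyValue() with your test framework of choice
    }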

 apache-kafka                     2 answers                     6 views
 2018-10-22         George

Getting complete copy of input data using Kafka connect on the sink nodes

I was trying to write a simple case of using the Kafka connector. My setup involves three nodes: N1, N2 and N3. N1 is the source and N2, N3 are the sink nodes in my case. I am writing data to a text file (say input.txt) on node N1 and, using the standalone Kafka connector, I hope to see a text file with content similar to input.txt on nodes N2 and N3. I am using the REST API to change the topic name, file name and tasks.max. However, during the experiments I was unable to get a complete copy of input.txt on both nodes (N2 and N3) at the same time. Also, tuning the value of...
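
One thing worth checking in a setup like this: a sink connector consumes as part of the group connect-<connector name>, so if N2 and N3 run sink connectors with the same name they share one consumer group and split the topic's partitions between them, which would explain neither node holding a complete copy. A minimal sketch of the file source/sink properties (connector names, paths and topic are placeholders):

    # source connector, run on N1
    name=file-source-n1
    connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
    tasks.max=1
    file=/tmp/input.txt
    topic=file-copy-topic

    # sink connector on N2 (use a different name, e.g. file-sink-n3, on N3 so each
    # node gets its own consumer group and therefore a full copy of the topic)
    name=file-sink-n2
    connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector
    tasks.max=1
    file=/tmp/output.txt
    topics=file-copy-topic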

 apache-kafka                     1 answer                     29 views
 2018-10-22         Wayne

How to use kafka and storm on cloudfoundry?

I want to know if it is possible to run Kafka as a cloud-native application, and whether I can create a Kafka cluster as a service on Pivotal Web Services. I don't want only client integration; I want to run the Kafka cluster/service itself. Thanks, Anil. I can point you at a few starting points; there would be some work involved to go from those starting points to something fully functional. One option is to deploy the Kafka cluster on Cloud Foundry (e.g. Pivotal Web Services) using Docker images. Spotify has Dockerized Kafka and kafka-proxy (including ZooKeeper). One thing to kee...

 apache-kafka                     2 answers                     31 views
 2018-10-22         Harold

Kafka org.apache.kafka.connect.converters.ByteArrayConverter doesn't work as values for key.converter and value.converter

I'm trying to build a pipeline where I have to move binary data from a Kafka topic to a Kinesis stream without transforming it, so I'm planning to use ByteArrayConverter in the worker properties setup. But I'm getting the following error, although I can see the ByteArrayConverter class here in the 0.11.0 version; I cannot find the same class under 3.2.x :( Any help would be much appreciated.
key.converter=io.confluent.connect.replicator.util.ByteArrayConverter
value.converter=io.confluent.connect.replicator.util.ByteArrayConverter
Exception in thread "main" org.apache.kafka.common.conf...
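
For comparison: the org.apache.kafka.connect.converters.ByteArrayConverter named in the title was added to Apache Kafka in 0.11.0, so it is not present in Confluent Platform 3.2.x (which bundles Kafka 0.10.2.x), while the io.confluent.connect.replicator.util.ByteArrayConverter in the properties above ships with Confluent Replicator and is only on the classpath when that component is installed. A worker-properties sketch assuming a 0.11.0+ installation:

    # worker properties, assuming Kafka 0.11.0+ (Confluent Platform 3.3+)
    key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
    value.converter=org.apache.kafka.connect.converters.ByteArrayConverter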

 apache-kafka                     1 answer                     53 views
 2018-10-22         Modesty

Pushing avro file to Kafka

I have an existing Avro file and I want to push the file data into Kafka, but this is not working:
/usr/bin/kafka-console-producer --broker-list test:9092 --topic test < part-m-00000.avro
Thanks. If you want to publish Avro messages, you can try kafka-avro-console-producer:
$ ./bin/kafka-avro-console-producer \
    --broker-list localhost:9092 --topic test \
    --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}' < avrofile.avro
It is part of the Confluent open source package. For more details, please refer here. ...
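
One caveat on the answer above: kafka-avro-console-producer reads newline-delimited JSON matching the supplied schema and Avro-encodes each line through the Schema Registry, so piping a binary Avro container file (such as part-m-00000.avro) into it will not work as-is. For the schema shown, the input would need to look roughly like:

    {"f1": "value1"}
    {"f1": "value2"}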

 apache-kafka                     2 answers                     59 views
 2018-10-22         Boyd

Enforcing Event Sourcing schema on Kafka

To be clear, I am not trying to use Kafka as the data store for event sourcing, merely to replicate events. The Confluent Schema Registry for Kafka seems very interesting in that it can validate the schema of messages sent by producers to a topic. However, from what I understand it treats each topic like a container file - one schema per topic. This restriction doesn't work for an event-sourced stream where, for a single aggregate like File, you will have multiple message schemas: FileCreated, FileMoved, FileCopied, FileDeleted. Putting each of these on a separate topic would ...
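
For what it's worth, newer Confluent Schema Registry and serializer versions relax the one-schema-per-topic assumption through configurable subject name strategies; a hedged sketch of the producer-side setting (registry URL is a placeholder, and class availability depends on the Confluent Platform version in use):

    import java.util.Properties;

    Properties props = new Properties();
    props.put("schema.registry.url", "http://localhost:8081");   // placeholder
    // Register schemas under "<topic>-<record name>" instead of "<topic>-value",
    // so FileCreated, FileMoved, FileCopied and FileDeleted can share one topic
    props.put("value.subject.name.strategy",
              "io.confluent.kafka.serializers.subject.TopicRecordNameStrategy");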

 apache-kafka                     1 answer                     61 views
 2018-10-22         Cedric

Flink streaming job switched to failed status

I have an 8-node Flink cluster and a 5-node Kafka cluster to run a WordCount job. In the first case, a lot of data is generated and pushed to Kafka and then the Flink job is launched. Everything works fine in this case. In the second case, the Flink streaming job is launched first and then data is produced into the Kafka topic. In this case, the Flink job usually switches to failed status. Sometimes it fails immediately after the job is launched; sometimes it fails several minutes later.
org.apache.flink.runtime.io.network.netty.exception.RemoteTransportExcepti...

 apache-kafka                     1 answer                     71 views
 2018-10-22         Dominic

Issue about workload balance in Flink streaming

I have a WordCount program running in a 4-worker-node Flink cluster which reads data from a Kafka topic. The topic holds a lot of pre-loaded text (words), and the words follow a Zipf distribution. The topic has 16 partitions, and each partition has around 700 M of data inside. There is one node which is much slower than the others. As you can see in the picture, worker2 is the slower node, but the slower node is not always worker2. From my tests, worker3 or other nodes in the cluster can also be the slow one. But there is always such a slow worker n...

 apache-kafka                     1 answer                     73 views
 2018-10-22         Gail

FlinkKafkaConsumer082 auto.offset.reset setting doesn't work?

I have a Flink streaming program which reads data from a Kafka topic. In the program, auto.offset.reset is set to "smallest". When testing in the IDE (IntelliJ IDEA), the program could always read data from the beginning of the topic. Then I set up a Flink/Kafka cluster and produced some data into the Kafka topic. The first time I ran the streaming job, it read data from the beginning of the topic. But after I stopped the streaming job and ran it again, it would not read data from the beginning of the topic. How can I make the program always read data from the beginning of...
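
A likely explanation, hedged: auto.offset.reset only applies when the consumer group has no committed offsets, and the 0.8-style FlinkKafkaConsumer082 commits offsets to ZooKeeper, so after the first cluster run later runs resume from the committed position instead of the beginning. Using a fresh group.id per run (sketched below with placeholder addresses and topic) restores the read-from-beginning behaviour; newer Flink Kafka consumers also expose setStartFromEarliest() for this.

    // Flink and flink-connector-kafka-0.8 imports omitted for brevity
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");   // placeholder
    props.put("zookeeper.connect", "localhost:2181");   // required by the 0.8-style consumer
    props.put("group.id", "wordcount-" + System.currentTimeMillis());  // fresh group => no committed offsets
    props.put("auto.offset.reset", "smallest");

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    DataStream<String> stream = env.addSource(
            new FlinkKafkaConsumer082<>("my-topic", new SimpleStringSchema(), props));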

 apache-kafka                     1 answer                     74 views
 2018-10-22         Brady

Does Apache Kafka support receiving the most recent message per Id at start of subscription?

I am evaluating message brokers for a project and could not find a clear answer to whether Apache Kafka supports the following use case: Messages with additional attributes to filter on when receiving are pushed to topics. These additional attributes can be considered a primary key for each message; in the simplest form it is just one Id attribute (e.g. the Id of a sensor that irregularly produces measurement data). 0 to n consumers receive these messages from the topics, eventually filtering on the primary key. Messages are not consumed when they are received, so al...

 apache-kafka                     1 answer                     104 views
 2018-10-22         Nick

How to stop consuming data of a kafka consumer when there is no more data in a topic?

I am trying to read data from a Kafka topic into Python using confluent_kafka. After I create a consumer to read data, it never stops. So is there any configuration in the Kafka consumer telling it to stop when there is no more data in the topic? Intuitively, when you use kafkacat on the command line, there is a -e option to tell the consumer to stop loading more data. Streams are by definition unbounded, so it's a matter of semantics to say "no more data". As @cricket_007 says, you'd want to handle this in your application code if you've not received data within the w...
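
A rough sketch of the usual application-side pattern, shown here with the Java consumer for consistency with the other snippets (confluent_kafka's Consumer.poll(timeout) behaves analogously, returning None when nothing arrives within the timeout); topic name and idle threshold are placeholders:

    import java.time.Duration;
    import java.util.Collections;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;

    // consumer: an already configured KafkaConsumer<String, String> (construction omitted)
    consumer.subscribe(Collections.singletonList("my-topic"));
    int idlePolls = 0;
    while (idlePolls < 3) {                                   // give up after ~3 empty polls (arbitrary)
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
        if (records.isEmpty()) {
            idlePolls++;
            continue;
        }
        idlePolls = 0;
        for (ConsumerRecord<String, String> record : records) {
            // process record.value() here
        }
    }
    consumer.close();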

 apache-kafka                     1 answer                     41 views
 2018-10-22         Xaviera

Kafka Producer Slows down when putting different keys

I have code that sends data to a Kafka topic:
public void sendMessage(String message, String key) {
    if (isAsync) { // Send asynchronously
        producer.send(new ProducerRecord<String, String>(topic, key, message),
                new ProducerCallback(key, message));
    } else { // Send synchronously
        try {
            producer.send(new ProducerRecord<String, String>(topic, key, message)).get();
        } catch (Exception e) {
            e.printStackTrace(); // handle the exception
        }
    }
}
I am passing the data to the method using the code below:
String Message = "Text message,Text messa...
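
Nothing in the snippet above is obviously wrong, but two things are commonly checked in this situation: the synchronous branch blocks on .get() for every record, which serializes all sends, and with many distinct keys records spread across partitions, so per-partition batches stay small. These are the batching settings usually tuned in that case (values purely illustrative):

    import org.apache.kafka.clients.producer.ProducerConfig;

    props.put(ProducerConfig.LINGER_MS_CONFIG, 20);        // wait up to 20 ms so batches can fill
    props.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536);    // 64 KB per-partition batches
    props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");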

 apache-kafka                     1 answer                     41 views
 2018-10-22         Blake

Kafka Connect Logging

Currently we are using a couple of custom connector plugins for our Confluent Kafka Connect distributed worker cluster. One thing that has bothered me for a long time is that Kafka Connect writes all logs from all deployed connectors to one file/stream. This makes debugging an absolute nightmare. Is there a way to let Kafka Connect log each connector to a different file/stream? Via connect-log4j.properties I am able to let a specific class log to a different file/stream, but this means that with every additional connector I have to adjust connect-log4j.properties. Thanks ...
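
A sketch of the log4j-1.x route mentioned above, with a dedicated appender per connector package (package name and file path are placeholders; as noted, this still has to be repeated for every new connector):

    # connect-log4j.properties (excerpt)
    log4j.appender.myConnectorFile=org.apache.log4j.RollingFileAppender
    log4j.appender.myConnectorFile.File=/var/log/kafka-connect/my-connector.log
    log4j.appender.myConnectorFile.MaxFileSize=10MB
    log4j.appender.myConnectorFile.layout=org.apache.log4j.PatternLayout
    log4j.appender.myConnectorFile.layout.ConversionPattern=[%d] %p %m (%c:%L)%n

    # send everything under the connector's package to its own file only
    log4j.logger.com.example.connect.myconnector=INFO, myConnectorFile
    log4j.additivity.com.example.connect.myconnector=false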

 apache-kafka                     1 answer                     43 views
 2018-10-22         Elma

Is kafka 2.0.0 a stable version that can be used for production?

I have to upgrade Kafka from 0.10.0 to 2.0.0 for a production system. Is it a stable release? Also, is there a concept of GA, RC, and M2 for Kafka releases? Yes, Kafka 2.0.0 is a stable release and it can be used in production. Apache Kafka also uses Release Candidates before releasing new stable versions; for example, 2.0.0 went through three RCs (RC0, RC1, RC2) before reaching GA. Apache Kafka, however, does not use Milestone releases. A time-based release plan was voted in by the community and is used to produce regular GA (production-ready) releases.

 apache-kafka                     1 answer                     44 views
