Pinot Version | 0.9.3 |
---|---|
Code | startreedata/pinot-recipes/managed-offline-flow-automatic-scheduling |
The Pinot Controller configuration is in `/config/controller-conf.conf`, the contents of which are shown below:
- `controller.task.scheduler.enabled=true` enables the automatic running of the real-time to offline (RT2OFF) job.
- `controller.task.frequencyPeriod=5m` configures it to run every 5 minutes.

The `realtime.segment.flush.threshold.rows` config is intentionally set to an extremely small value so that the segment will be committed after 10,000 records have been ingested.
In a production system this value should be set much higher, as described in the configuring segment threshold guide.

The table config contains a section for `RealtimeToOfflineSegmentsTask`, which is extracted below:
The `schedule` parameter indicates when this task will run.
Its value is a Quartz cron expression; in this case, the job runs once a minute.
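As a rough illustration of where `schedule` sits, the sketch below builds a `task` section of the shape Pinot table configs use and prints it as JSON. The helper name `task_config` is hypothetical, the `5m` and `1m` periods are placeholder values rather than the recipe's actual settings, and `0 * * * * ?` is one Quartz expression that fires once a minute (at second 0 of every minute).

```python
import json

# Quartz cron fields: second minute hour day-of-month month day-of-week [year]
ONCE_A_MINUTE = "0 * * * * ?"  # fires at second 0 of every minute


def task_config(schedule, bucket_time_period, buffer_time_period):
    """Build a table-config `task` section for RealtimeToOfflineSegmentsTask.

    Hypothetical helper for illustration; it only checks the field count
    of the cron expression, not its full Quartz syntax.
    """
    fields = schedule.split()
    if len(fields) not in (6, 7):
        raise ValueError(
            f"Quartz cron expressions have 6 or 7 fields, got {len(fields)}"
        )
    return {
        "taskTypeConfigsMap": {
            "RealtimeToOfflineSegmentsTask": {
                "bucketTimePeriod": bucket_time_period,
                "bufferTimePeriod": buffer_time_period,
                "schedule": schedule,
            }
        }
    }


# The period values below are placeholders, not the recipe's actual settings.
print(json.dumps({"task": task_config(ONCE_A_MINUTE, "5m", "1m")}, indent=2))
```

Quartz expressions carry a leading seconds field, which is why the once-a-minute schedule has six fields rather than the five of a Unix crontab entry.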
`bufferTimePeriod`, `bucketTimePeriod`, and `schedule` are intentionally set to very low values so that it's easier to see how they work.
In a production setup, `bufferTimePeriod` and `bucketTimePeriod` would usually be set to 1 day or more, and `schedule` could be set to run once a day.

For more on `RealtimeToOfflineSegmentsTask`, see the Manually scheduling real-time to offline job guide.

Next, ingest data into the `events` Kafka topic by running the following: