3 hours ago · Tech · 0 comments

Although most of my time-series data still lives in InfluxDB, I also have VictoriaMetrics running and playing host to daily todo list stats along with a few other things.

However, I recently realised that I screwed up when deploying it: I'd assumed that de-duplication of data was a given, but actually dedupe is disabled by default. In order to fill gaps, my cronjobs tend to write data for multiple days at a time, which has resulted in duplication of the data, causing inflation when reporting total values.

This short post talks about enabling the deduplication feature in VictoriaMetrics as well as tidying up existing data.

Impact of Duplicated Data

Quite a few of the stats that I collect in VictoriaMetrics are ultimately visualised by summing or counting values:

    sum_over_time(todo_list_completed{db="workload_stats"}[1d])

This results in a graph like the following

Now, I work bloody hard and do get through a lot in a day, but 1600 items is a ridiculous amount: assuming an 11 hour work…
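To make the inflation concrete, here's a rough Python sketch of what happens when the same points get written on several cron runs, and what keeping a single sample per time interval does to the total. The sample values, the 60 second interval and the function names are all made up for illustration; this is a simplified model of interval-based deduplication, not how VictoriaMetrics implements it.

```python
# Hypothetical samples: (unix_timestamp_seconds, value).
original = [(1700000000, 5.0), (1700003600, 3.0), (1700007200, 4.0)]

# A cron job that rewrites several days at a time stores the same
# points again on every run; here, three runs.
written = original * 3


def naive_sum(samples):
    """Add up every stored sample, duplicates included."""
    return sum(value for _, value in samples)


def dedupe(samples, interval_s=60):
    """Keep one sample per interval bucket (the latest timestamp wins).

    Simplified model only: interval_s is an assumed setting, not a
    value taken from the post.
    """
    buckets = {}
    for ts, value in samples:
        key = ts // interval_s
        if key not in buckets or ts >= buckets[key][0]:
            buckets[key] = (ts, value)
    return sorted(buckets.values())


print(naive_sum(written))          # 36.0 (three times the real total)
print(naive_sum(dedupe(written)))  # 12.0
```

With three identical writes, the summed value is exactly three times the real one, which is the same kind of inflation showing up in the graph.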
