/storage/seat is massive

This partition getting too big causes issues. Seat Stands for

  • S – stats
  • E – Events
  • A – alarms
  • T – Tasks

For this article we’re interested in when the growth is in events. In the example below we see that the eventtblsp is 54GB. this is the largest directory in /storage/seat

du -sh /storage/seat/vpostgres/*
1.3M alarmtblsp
54G eventtblsp
2.9G hs1tblsp
900M hs2tblsp
308M hs3tblsp
400M hs4tblsp
2.3G tasktblsp

Firstly you need to identify what records are filling the database

/opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgresĀ 

SELECT COUNT(0),
event_type
FROM vpxv_event_all
GROUP BY event_type
ORDER BY 1 DESC LIMIT 5;

The output in this case is

16297499 | vim.event.UserLoginSessionEvent
16295260 | vim.event.UserLogoutSessionEvent
2690546  | DatastoreFileDownloadEvent
1187037  | vim.event.TaskEvent
486828   | esx.problem.visorfs.ramdisk.full

You should then investigate using the vsphere client to see what’s issuing all these events.

Below here we can should investigate what dcui@127.0.0.1 is doing and try and stop it

Often it’s not so simple to stop these events. They’re often coming from backup software which vmware admins have no control over. In this instance we’ve used a workaround. This creates a procedure in the database which removes these unwanted events regularly.

Please change the procedure as appropriate to your issue

ldt_keep_from - this can be changed to keep any amount of days of the event. In our example we're keeping 10 days of information.

lv_event_type1 and lv_event_type2 - these are the 2 events we have an issue with in this example

If you've only 1 event then you'll need to change the code (event_type = '''||
           lv_event_type1||''' or event_type = '''||lv_event_type2||''')
to
 (event_type = '''|| lv_event_type1||''')

/opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres

CREATE OR REPLACE FUNCTION remove_some_Events ()
returns VARCHAR as $$
DECLARE
event_partition VARCHAR (2000);
lv_event_type1 VARCHAR (200);
lv_event_type2 VARCHAR (200);
ldt_keep_from date;
ln_count_rows_removed_event int;
ln_count_rows_removed_event_arg int;
ln_rowcount int;
DECLARE return_count VARCHAR := 'DEFAULT';
BEGIN
 ln_count_rows_removed_event = 0;
 ln_count_rows_removed_event_arg = 0;
 -- this is configuring the amount of this data you want to keep.
 -- Change this as you feel comfortable.
 ldt_keep_from = timezone('UTC', now())  - (10 || ' days')::interval;
 --- These are the events we're going to remove for issue 1
 lv_event_type1 = 'vim.event.UserLoginSessionEvent';
 lv_event_type2 = 'vim.event.UserLogoutSessionEvent';
  FOR part IN 1..92  LOOP
 event_partition = 'delete  from VPX_EVENT_ARG_' || CAST(part AS TEXT)
            || ' where event_id in (select event_id from  VPX_EVENT_' || CAST(part AS TEXT)
            || ' where create_time < '''||ldt_keep_from ||''' and ((event_type = '''||
           lv_event_type1||''' or event_type = '''||lv_event_type2||''')))';
         EXECUTE event_partition;                     get diagnostics ln_rowcount =  row_count;                     ln_count_rows_removed_event_arg = ln_count_rows_removed_event_arg + ln_rowcount;                     ln_rowcount = 0;
 event_partition = 'delete  from VPX_EVENT_' || CAST(part AS TEXT)
           || ' where (event_type = '''||lv_event_type1||''' or event_type = '''||lv_event_type2||''')';
             EXECUTE event_partition;
                         get diagnostics ln_rowcount =  row_count;
                         ln_count_rows_removed_event = ln_count_rows_removed_event + ln_rowcount;
                         ln_rowcount = 0;
 END LOOP;             return_count = 'vpx_events_removed=' || ln_count_rows_removed_event || ',vpx_event_arguments_removed=' || ln_count_rows_removed_event_arg ;             RETURN return_count;
 END
 $$language plpgsql;

this can be run manually the first time. it may take a long time depending on the amount of data which need to be removed. It can be executed with all other services running.

VCDB=# select remove_some_Events();

vpx_events_removed=1988331,vpx_event_arguments_removed=78653978
(1 row)

once your happy you can package it in a simple shell script.

vi /storage/updatemgr/run_cleanup.sh

#!/bin/sh
/opt/vmware/vpostgres/current/bin/psql -U postgres -d VCDB -c "SELECT * FROM clean_some_Events();"
:wq! 
chmod 777 /storage/updatemgr/run_cleanup.sh
./storage/updatemgr/run_cleanup.sh

Now it’s best to schedule this to run every night.

crontab -e
0  2 * * * /storage/updatemgr/run_cleanup.sh
:wq!

       

Performance Statistics in vCenter using too many resources

Statistics are important to show how your VM’s are performing. However this may sometimes become a balancing act. Too many performance statistics can degrade the performance. These stats are like a data warehouse and can effect the critical operational side of virtual center.

The stats are /storage/seat.

Here is an example of a setup which has 130GB of hist stat.

du -sh /storage/seat/vpostgres/*
1.3M /storage/seat/vpostgres/alarmtblsp
54M /storage/seat/vpostgres/eventtblsp
26G /storage/seat/vpostgres/hs1tblsp
39G /storage/seat/vpostgres/hs2tblsp
22G /storage/seat/vpostgres/hs3tblsp
11G /storage/seat/vpostgres/hs4tblsp
2.3M /storage/seat/vpostgres/tasktblsp
hs1tblsp  --  past day stats
hs2tblsp  --  past week stats
hs3tblsp  --- past month stats
hs4tblsp  --  past year stats

Double check that your statistic levels are set to default. Anything higher than which on any of these columns will cause a big increase in the amount of data collected.

If you have vRealize Operations VROPS. You maybe be able to disable this stat collection and just rely on VROPS for your statistics.

You may want to clear out some of this data. This KB 2110031 covers how to trim out some data.

I’m a DBA however and I do love to truncate. This is fast and efficient. First step is to stop all the services except postgres. Then copy and paste in the script below. All your historical stats will be removed once this script is run.

service-control --stop --all
service-control --start vmware-vpostgres
/opt/vmware/vpostgres/current/bin/psql -U postgres VCDB

Copy and paste in the code below and press return

CREATE OR REPLACE FUNCTION cleanup_hist_stats_truncate (SaveData INTEGER)
 returns void as $$
 DECLARE r record;
 DECLARE sql_stmt varchar(2000);
 DECLARE affected_rows integer;
 DECLARE affected_rows_step1 integer;
 BEGIN
 TRUNCATE TABLE VPX_SAMPLE_TIME1;
 TRUNCATE TABLE VPX_SAMPLE_TIME2;
 TRUNCATE TABLE VPX_SAMPLE_TIME3;
 TRUNCATE TABLE VPX_SAMPLE_TIME4;
 FOR r IN SELECT table_name FROM information_schema.tables WHERE table_name LIKE 'vpx_hist_stat%' AND table_type='BASE TABLE'     LOOP         sql_stmt = 'TRUNCATE TABLE ' || r.table_name;         EXECUTE sql_stmt;                        END LOOP;
 END
 $$language plpgsql;
 select cleanup_hist_stats_truncate(1);
            

vcac database growing too large 7.x

First step is to check your postgres logs.

These are on /storage/db/pg_data/pg_log

more postgresql.log

You may find, ERROR,XX001,”could not read block 0 in file “”base/8198863/8987722″”: read only 0 of 8192 bytes”,,,,,”automatic vacuum of table “”vrhb.public.docs_core_authcredentialsservicestate”””,,,,””

The fix to stop your database from growing.

psql vrhb

reindex table vrhb.public.docs_core_authcredentialsservicestate;

vacuum full reindex table vrhb.public.docs_core_authcredentialsservicestate;

\c vcac

vacuum analyze;

\q

You can verify that the messages are gone from your postgresql.log