EPG data format

Here's the format of the EPG information (as far as I understand it).

The Sky EPG divides into several sections on different PIDs.

PIDs $30 to $37 carry EPG event times/lengths/titles in tables $A0 to $A3. Each PID is used for one day's worth of data, from today up to 7 days ahead. The PID used is

    (epg_date for today) MOD 8 + $30

The data is further partitioned into four 6-hour time bands (midnight-6am, 6am-noon, noon-6pm, 6pm-midnight, all GMT). Table $A0 carries data for the midnight-6am band, $A1 for 6am-noon etc.
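As a rough illustration of the arithmetic above, here is a minimal Python sketch that picks the event PID and table for a given day and time. The function names are made up for this example, and applying the MOD 8 formula to any day's epg_date (not just today's) is my assumption.

```python
EPG_EVENT_PID_BASE = 0x30    # PIDs $30 to $37
EPG_EVENT_TABLE_BASE = 0xA0  # tables $A0 to $A3
SECONDS_PER_BAND = 6 * 3600  # four 6-hour bands per day, GMT


def event_pid(epg_date: int) -> int:
    """PID carrying event data for the day numbered epg_date."""
    return EPG_EVENT_PID_BASE + (epg_date % 8)


def time_band(seconds_since_midnight_gmt: int) -> int:
    """Band index: 0 = midnight-6am, 1 = 6am-noon, 2 = noon-6pm, 3 = 6pm-midnight."""
    if not 0 <= seconds_since_midnight_gmt < 24 * 3600:
        raise ValueError("time of day out of range")
    return seconds_since_midnight_gmt // SECONDS_PER_BAND


def event_table_id(seconds_since_midnight_gmt: int) -> int:
    """Table id ($A0 to $A3) for the band containing the given time."""
    return EPG_EVENT_TABLE_BASE + time_band(seconds_since_midnight_gmt)
```

For example, an epg_date of 13 gives 13 MOD 8 = 5, so PID $35, and 7pm GMT falls in band 3, so table $A3.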

PIDs $40 to $47 carry corresponding synopsis strings for the events on PIDs $30 to $37. Tables $A8 to $AB are used, again divided by the four 6-hour time bands. The 16-bit epg_event_id is used to tie the synopsis to the corresponding event.
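A small Python sketch of how the synopsis data might be located and tied back to events. It assumes the synopsis PID and table sit at the same day and band offsets as the event PID and table, which is how I read "corresponding" but have not confirmed. The function names and the dictionaries of already-decoded strings are invented for the example.

```python
from typing import Dict, Optional, Tuple


def synopsis_location(event_pid: int, event_table_id: int) -> Tuple[int, int]:
    """Map an event (PID, table) pair to the matching synopsis pair.

    Assumption: same day offset ($30 -> $40) and same band offset ($A0 -> $A8).
    """
    if not 0x30 <= event_pid <= 0x37:
        raise ValueError("not an EPG event PID")
    if not 0xA0 <= event_table_id <= 0xA3:
        raise ValueError("not an EPG event table")
    return event_pid + 0x10, event_table_id + 0x08


def attach_synopses(
    titles: Dict[int, str], synopses: Dict[int, str]
) -> Dict[int, Tuple[str, Optional[str]]]:
    """Join titles and synopses on the 16-bit epg_event_id.

    Both dictionaries are keyed by epg_event_id and are assumed to come
    from the same service and day. Events without a synopsis get None.
    """
    return {eid: (title, synopses.get(eid)) for eid, title in titles.items()}
```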

The "Default Transponder" carries the full event and synopsis data for 8 days. Other transponders seem to carry EPG-format event data only for the current 6-hour chunk and the next one, and no EPG synopsis data at all. This is why, when watching one programme, you can use the banner to look at EPG data for all channels, but only between 6 and 12 hours ahead. (You can also see synopsis data for the current programme, but this is extracted from the industry-standard EIT entry, not the proprietary EPG.)

These tables $A0 to $A3 and $A8 to $AB follow normal table format rules in the DVB-SI specification for private tables. The fields appear to be:

    table_id                      8  $A0 $A1 $A2 $A3 $A8 $A9 $AA or $AB.
    section_syntax_indicator      1  *
    reserved_future_use           1  *
    reserved                      2  *
    section_length               12  *
    epg_service_id               16  Different encoding from the usual 16-bit service_id used in EIT.
    reserved                      2  *
    version_number                5  *
    current_next_indicator        1  *
    section_number                8  *
    last_section_number           8  *
    epg_date                     16  Days counting from epg epoch
    for {} {
      epg_event_id               16  Different for each event in EPG
      ?                           4  $B or $F
      descriptors_loop_length    12  *
      for {} {
        descriptor()
      }
    }
    CRC_32                       32  *

Fields marked * follow the normal definition for DVB SI.

The descriptors are $B3 $B5 $B6 $C3 $C4 in tables $A0 to $A3, and $B9 for tables $A8 to $AB. The format of these descriptors again follows DVB-SI conventions. The important ones are $B5 which carries the EPG event information:

    descriptor_tag                8  $B5 for EPG event
    descriptor_length             8  Bytes following this one in descriptor
    epg_time                     16  In 2-second units from midnight GMT
    epg_length                   16  In 2-second units
    epg_genre                     8  Programme genre
    epg_flags                    16  Flags for sex, violence, subtitles, etc?
    epg_huffenc_title          rest  Huffman encoded title string

and $B9 which carries the synopsis:

    descriptor_tag                8  $B9 for EPG synopsis
    descriptor_length             8  Bytes following this one in descriptor
    epg_huffenc_synopsis       rest  Huffman encoded synopsis string
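To make the layout above concrete, here is a minimal Python sketch that walks one complete section and unpacks a $B5 payload. It is only an illustration of the field widths as listed: the names are mine, the CRC_32 is skipped rather than checked, and the unknown 4-bit field is returned untouched. Multi-byte fields are read big-endian, as usual in DVB SI.

```python
import struct
from typing import List, NamedTuple, Tuple


class EpgEventInfo(NamedTuple):
    start_seconds: int     # epg_time converted from 2-second units
    duration_seconds: int  # epg_length converted from 2-second units
    genre: int
    flags: int
    huffenc_title: bytes   # still Huffman encoded


def parse_descriptors(buf: bytes) -> List[Tuple[int, bytes]]:
    """Split a descriptor loop into (tag, payload) pairs."""
    out, pos = [], 0
    while pos + 2 <= len(buf):
        tag, length = buf[pos], buf[pos + 1]
        out.append((tag, buf[pos + 2:pos + 2 + length]))
        pos += 2 + length
    return out


def parse_b5_payload(payload: bytes) -> EpgEventInfo:
    """Decode the bytes that follow descriptor_length in a $B5 descriptor."""
    if len(payload) < 7:
        raise ValueError("$B5 payload too short")
    epg_time, epg_length, genre, flags = struct.unpack(">HHBH", payload[:7])
    return EpgEventInfo(epg_time * 2, epg_length * 2, genre, flags, payload[7:])


def parse_epg_section(section: bytes) -> dict:
    """Parse one section of tables $A0-$A3 or $A8-$AB (CRC not verified)."""
    section_length = ((section[1] & 0x0F) << 8) | section[2]
    end = 3 + section_length - 4  # stop before the 4-byte CRC_32
    events = []
    pos = 10  # first byte after epg_date
    while pos + 4 <= end:
        event_id = (section[pos] << 8) | section[pos + 1]
        unknown = section[pos + 2] >> 4  # the "?" field, $B or $F
        loop_len = ((section[pos + 2] & 0x0F) << 8) | section[pos + 3]
        descs = parse_descriptors(section[pos + 4:pos + 4 + loop_len])
        events.append((event_id, unknown, descs))
        pos += 4 + loop_len
    return {
        "table_id": section[0],
        "epg_service_id": (section[3] << 8) | section[4],
        "version_number": (section[5] >> 1) & 0x1F,
        "current_next_indicator": section[5] & 0x01,
        "section_number": section[6],
        "last_section_number": section[7],
        "epg_date": (section[8] << 8) | section[9],
        "events": events,
    }
```

A caller would typically filter each event's descriptor list for tag $B5 (or $B9 in the synopsis tables) and pass the payload on to the Huffman decoder.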

A similar format is also used for descriptor $B4 which carries "Series Link" information, which describes where to find the next episode in a series. Unlike the above descriptors, which appear in private tables, this one is inserted into the DVB standard EIT (table $4E on PID $12). Since $B4 is in the range for "user defined" descriptors, it will be ignored by a DVB standard receiver. Series link seems to appear only for the programme currently being transmitted, not for the next programme (which also appears in table $4E).

    descriptor_tag                8  $B4 for series link
    descriptor_length             8  Bytes following this one in descriptor
    epg_service_id               16  Service id (EPG one not regular EIT one)
    epg_event_id                 16  Event ID within EPG
    epg_date                     16  Days counting from epg epoch
    epg_time                     16  In 2-second units from midnight GMT
    epg_length                   16  In 2-second units
    epg_genre                     8  Programme genre
    epg_flags                    16  Flags for sex, violence, subtitles, etc?
    epg_huffenc_title          rest  Huffman encoded title string
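The series link descriptor can be unpacked in much the same way. The following Python sketch takes a whole descriptor (tag and length included) and assumes big-endian fields as elsewhere in DVB SI. The class and function names are mine, and the title is left Huffman encoded.

```python
import struct
from typing import NamedTuple


class SeriesLink(NamedTuple):
    epg_service_id: int
    epg_event_id: int
    epg_date: int          # days counting from the epg epoch
    start_seconds: int     # epg_time converted from 2-second units
    duration_seconds: int  # epg_length converted from 2-second units
    genre: int
    flags: int
    huffenc_title: bytes   # still Huffman encoded


def parse_series_link(descriptor: bytes) -> SeriesLink:
    """Decode a complete $B4 descriptor as found in EIT table $4E."""
    if len(descriptor) < 2 or descriptor[0] != 0xB4:
        raise ValueError("not a $B4 series link descriptor")
    length = descriptor[1]
    payload = descriptor[2:2 + length]
    if len(payload) < 13:
        raise ValueError("series link payload too short")
    # Five 16-bit fields, one 8-bit genre, one 16-bit flags word = 13 bytes.
    sid, eid, date, time2, len2, genre, flags = struct.unpack(
        ">HHHHHBH", payload[:13]
    )
    return SeriesLink(sid, eid, date, time2 * 2, len2 * 2, genre, flags, payload[13:])
```

The (epg_date, start time) pair returned here can then be used to work out which event PID and table should hold the next episode.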

Huffman encoding works by assigning binary bit strings to characters or strings of characters. See HUFFMAN.MAP for the mapping as far as I have determined it. PVRHUFF.PAS contains the code to convert a byte string from the descriptor into the corresponding text string. These files can be found in the Software section.
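For readers who do not want to read the Pascal, here is a minimal Python sketch of the general technique. It is not a port of PVRHUFF.PAS, and the code table in it is a made-up toy, not the contents of HUFFMAN.MAP. It also assumes bits are consumed most significant bit first and simply drops any trailing bits that do not complete a code; how the real stream marks the end of a string is not covered here.

```python
from typing import Dict

# Toy table for illustration only: bit string -> character or string.
# The real mapping would have to be loaded from HUFFMAN.MAP.
TOY_CODE_TABLE: Dict[str, str] = {
    "10": "a",
    "01": "c",
    "110": "b",
    "111": "news ",  # a code may stand for several characters
}


def decode_prefix_code(data: bytes, code_table: Dict[str, str]) -> str:
    """Decode a byte string using a prefix-free table of bit strings."""
    out = []
    current = ""  # bits collected so far for the code being read
    for byte in data:
        for bit_index in range(7, -1, -1):  # MSB first (assumption)
            current += "1" if (byte >> bit_index) & 1 else "0"
            if current in code_table:
                out.append(code_table[current])
                current = ""
    # Leftover bits in `current` are treated as padding and ignored.
    return "".join(out)
```

With the toy table, the single byte $F2 (bits 111 10 01 0) decodes to "news ac", with the final 0 bit discarded as padding.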