Patent attributes
Embodiments of the present invention allow multiple data streams to be analyzed as a single data set, referred to herein as a stream set. The streams included in the stream set may be specified through a user script or query. For example, a query may be used to gather all streams created within a date range. The query could include one or more filters to gather certain information from the data streams or to exclude certain data streams that otherwise fall within the query's range. A stream may be an unstructured byte stream of data. The stream may be created by append-only writes to the end of the stream. A stream could also be a structured stream that includes metadata defining column structure and affinity/clustering information.
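By way of non-limiting illustration, the selection of streams into a stream set by date range and exclusion filter may be sketched as follows. The sketch is written in Python purely for exposition. The names `Stream`, `build_stream_set`, and `read_stream_set`, the in-memory byte storage, and the representation of structured-stream metadata as a list of column names are all hypothetical assumptions and are not taken from the description. An actual embodiment may store and query streams quite differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, List, Optional


@dataclass
class Stream:
    """Hypothetical in-memory stand-in for a stored data stream."""
    name: str
    created: date
    data: bytes = b""
    # None models an unstructured byte stream; a list of column names
    # models the metadata of a structured stream (assumed representation).
    columns: Optional[List[str]] = None

    def append(self, chunk: bytes) -> None:
        # Append-only write: new bytes are added at the end of the stream.
        self.data += chunk


def build_stream_set(
    streams: Iterable[Stream],
    start: date,
    end: date,
    exclude: Optional[Callable[[Stream], bool]] = None,
) -> List[Stream]:
    """Gather every stream created within [start, end], dropping any
    stream for which the optional exclusion filter returns True."""
    selected = []
    for stream in streams:
        if not (start <= stream.created <= end):
            continue  # outside the query's date range
        if exclude is not None and exclude(stream):
            continue  # in range, but explicitly filtered out
        selected.append(stream)
    return selected


def read_stream_set(stream_set: Iterable[Stream]) -> bytes:
    """Present the member streams as one data set by concatenating them
    in the order in which they were selected."""
    return b"".join(stream.data for stream in stream_set)


streams = [
    Stream("log_a", date(2020, 1, 5), b"a1\n"),
    Stream("log_b", date(2020, 1, 20), b"b1\n"),
    Stream("tmp_c", date(2020, 1, 25), b"c1\n"),
    Stream("log_d", date(2020, 3, 1), b"d1\n"),
]
streams[0].append(b"a2\n")

january = build_stream_set(
    streams,
    start=date(2020, 1, 1),
    end=date(2020, 1, 31),
    exclude=lambda s: s.name.startswith("tmp_"),
)
print([s.name for s in january])  # ['log_a', 'log_b']
print(read_stream_set(january))   # b'a1\na2\nb1\n'
```

In this sketch, `tmp_c` falls within the date range but is removed by the exclusion filter, while `log_d` is never considered because it was created outside the range. Concatenation is only one assumed way of treating the members as a single data set.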