So, in the next series of videos, we are going to talk about how we represent data in Bioconductor. First, we're going to talk a little bit about the basic data types we think of when we think about experimental data. In Bioconductor, we tend to think of data as being one of three different types. There's experimental data, which is data that we have collected with some high-throughput experiment, for example a sequencing or a microarray experiment. The data is typically either sequence reads, aligned sequences, or numerical measurements of, say, the expression of a gene. Together with the experimental data, we have metadata on the experiment. Metadata is a kind of annotation on the experiment: it tells us more about what these numbers actually mean and where they come from. An example is information about the samples that were profiled in the experiment: whether a given sample is male or female, the age, the ethnic origin of the person, and so on. And finally, we have annotation data. Annotation data is data we grab, most often from big databases, that gives context to the experiment. It could be information about conservation. It could be information about nearby genes or local GC content. Besides the three different data types, we also have different levels of data. These days, when we do a high-throughput experiment in biology, we typically start off with some really raw, unprocessed data that is typically really big. Here, we show a few reads from a next-generation sequencing experiment. Through a process called pre-processing, we take this raw data and transform it into somewhat more tidy and interpretable data. We typically go through some steps where we try to normalize the data, to make different samples more comparable.
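As a rough illustration of what "making samples more comparable" can mean, here is a minimal Python sketch of one simple normalization, scaling raw read counts by each sample's total (counts per million). This is an assumption chosen for illustration, not the method used in the lecture; the function name `counts_per_million` and the toy data are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw counts so every sample sums to one million.

    counts: dict mapping a sample name to a list of raw read counts,
            one entry per gene (same gene order in every sample).
    """
    normalized = {}
    for sample, values in counts.items():
        library_size = sum(values)  # total reads in this sample
        if library_size == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        normalized[sample] = [v * 1_000_000 / library_size for v in values]
    return normalized


# Toy data (made up): sample_B was sequenced ten times deeper than sample_A,
# but the relative expression of the three genes is identical.
raw_counts = {
    "sample_A": [10, 30, 60],
    "sample_B": [100, 300, 600],
}

cpm = counts_per_million(raw_counts)
```

After scaling, the two samples have the same values, so the difference in sequencing depth no longer dominates the comparison.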
And in this sometimes rather long process, we may have several intermediate products, like raw reads versus aligned reads versus summarized reads in a high-throughput experiment. Finally, when we have pre-processed the data, we do some kind of statistical analysis, which is where a lot of the analytical labor happens, and wind up with some results, which are basically our conclusions from the experiment. So, the way we represent data in Bioconductor is that we have a series of data containers. That's an important concept. We have this great retrospective quote from Robert Gentleman, who created Bioconductor and was also one of the two co-creators of R, talking about how this concept of data containers has been an important part of the success of Bioconductor. He says here in the quote that we may have different ways of getting the data, and we may have different microarray vendors or different types of experiments, but at a certain point in time, if we do an experiment where we measure gene expression, we end up with some data on the expression of some genes. And if we have a common data container for that type of data, we can then create analytical tools that just work on this common data container, and that is going to make life a lot simpler for us. So, what we're representing in this little scheme is that we have different ways of getting the raw data. For example, we may have different vendors in different experiments; we may have different microarray vendors that represent the data differently. We may have different sequencing machines, which means that we have to pre-process the data slightly differently. And typically, the pre-processing we do is tightly integrated with the specific type of technology we're using. I don't just mean the sequencing technology, I also mean the sequencing instrument.
But at a certain point in time, the pre-processing results in data that goes into a common data container. And from then on, we don't have to remember too much about where the data really came from. That gives a lot of power: it means that the analysis step that we have here on the slide can be shared between these different vendors. This is going to become a little clearer with the examples that follow.
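The idea of a common data container can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ExpressionContainer`, `from_vendor_a`, `from_vendor_b`, and `mean_by_group` are hypothetical names, the two vendor formats are invented, and none of this is a Bioconductor class or function. The container holds the three kinds of data discussed above (experimental data, sample metadata, annotation), and a single analysis function works no matter which vendor the data came from.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ExpressionContainer:
    """Hypothetical common container (not a Bioconductor class)."""
    expression: dict       # experimental data: gene -> values, one per sample
    sample_info: list      # metadata: one dict per sample, in sample order
    gene_annotation: dict  # annotation: gene -> dict of context

    def __post_init__(self):
        # Every gene must have exactly one value per sample.
        for gene, values in self.expression.items():
            if len(values) != len(self.sample_info):
                raise ValueError(f"{gene}: expected one value per sample")


def from_vendor_a(rows, samples, annotation):
    """Invented vendor A format: list of (gene, [values per sample])."""
    return ExpressionContainer(dict(rows), samples, annotation)


def from_vendor_b(table, samples, annotation):
    """Invented vendor B format: sample id -> {gene: value}."""
    genes = table[samples[0]["id"]]
    expression = {g: [table[s["id"]][g] for s in samples] for g in genes}
    return ExpressionContainer(expression, samples, annotation)


def mean_by_group(container, gene, field):
    """Analysis step: mean expression of one gene per metadata group."""
    groups = {}
    for value, info in zip(container.expression[gene], container.sample_info):
        groups.setdefault(info[field], []).append(value)
    return {group: mean(values) for group, values in groups.items()}


# Toy inputs (made up for the example).
samples = [
    {"id": "s1", "sex": "female"},
    {"id": "s2", "sex": "male"},
    {"id": "s3", "sex": "female"},
]
annotation = {"geneX": {"chromosome": "chr1"}}

container_a = from_vendor_a([("geneX", [2.0, 4.0, 6.0])], samples, annotation)
container_b = from_vendor_b(
    {"s1": {"geneX": 2.0}, "s2": {"geneX": 4.0}, "s3": {"geneX": 6.0}},
    samples,
    annotation,
)

result_a = mean_by_group(container_a, "geneX", "sex")
result_b = mean_by_group(container_b, "geneX", "sex")
```

Only the two `from_vendor_*` functions know anything about the input format. Once the data is in the container, `mean_by_group` does not need to know where it came from, which is the point being made on the slide.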