XML is a markup language used to define custom document formats and data exchange standards. It lets users define their own tags and attributes to structure text-based data. To be considered well-formed, an XML document must follow syntax rules such as having properly nested, matching start and end tags and a single root element. A Document Type Definition (DTD) can be used to establish a fixed vocabulary and structure for the XML documents of an application. XPath and XQuery are query languages for XML: XPath selects parts of a document by element name, attribute, value, and structure, and XQuery builds on XPath to query and transform XML documents and datasets.
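The well-formedness rules and path-based querying described above can be illustrated with a minimal sketch. It assumes Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath and no XQuery. The `CATALOG` document and the helper names `is_well_formed`, `titles`, and `title_by_id` are hypothetical, introduced here only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: a single root element with properly nested tags.
CATALOG = """<catalog>
  <book id="b1"><title>XML Basics</title><price>29.50</price></book>
  <book id="b2"><title>Query Languages</title><price>41.00</price></book>
</catalog>"""


def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for mismatched tags, multiple root elements, and similar errors.
        return False


def titles(text):
    """Select every <title> under a <book> child of the root by element name."""
    root = ET.fromstring(text)
    return [node.text for node in root.findall("./book/title")]


def title_by_id(text, book_id):
    """Select the title of the book whose id attribute equals `book_id`."""
    root = ET.fromstring(text)
    node = root.find(f"./book[@id='{book_id}']/title")
    return node.text if node is not None else None


if __name__ == "__main__":
    print(is_well_formed(CATALOG))           # sample document parses cleanly
    print(is_well_formed("<a><b></a></b>"))  # improperly nested tags
    print(is_well_formed("<a/><b/>"))        # more than one root element
    print(titles(CATALOG))
    print(title_by_id(CATALOG, "b2"))
```

This sketch checks well-formedness only. Validating a document against a DTD, or running full XPath or XQuery expressions, would need a different tool than the one assumed here.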