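Over the past few days I needed to build a feature that validates XML files against rules defined in an XSD schema:
One record from the XML file (the "NUID" attribute was deliberately corrupted for testing):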
<Record ChipsetType="xxxxxxxxx" ChipsetExtension="xRxDxxx" ChipsetCut="80" NUID="ZE2D6525" NUIDCheckNumber="18CEBD29" STB_CA_SN="89349247" DeviceManufacturerSN="JC1S03200002" CSCDataConfig="030404DC" CSCDataRevisionNumber="00000000" CSCDataCheckNumber="39C5D43E" CERTReportCheckNumber="FE404962A674CA06" PersonalizationDate="2020/08/18 16:57" CRC="6A4092E0" />
Part of the XSD rule definitions follows (the "NUID" attribute is the main test case):
<!--*************************************************************************-->
<!-- Define new type LogRecordType -->
<!--*************************************************************************-->
<xs:complexType name="RecordType">
    <xs:attribute name="ChipsetType" type="String20Type" use="required"/>
    <xs:attribute name="ChipsetExtension" type="String20Type" use="required"/>
    <xs:attribute name="ChipsetCut" type="String20Type" use="required"/>
    <xs:attribute name="NUID" type="Int32Type" use="required"/>
    <xs:attribute name="NUIDCheckNumber" type="Int32Type" use="required"/>
    <xs:attribute name="STB_CA_SN" type="Int32Type" use="required"/>
    <xs:attribute name="CSCDataConfig" type="Int32Type" use="required"/>
    <xs:attribute name="CSCDataRevisionNumber" type="Int32Type" use="required"/>
    <xs:attribute name="CSCDataCheckNumber" type="Int32Type" use="required"/>
    <xs:attribute name="CERTReportCheckNumber" type="Int64Type" use="required"/>
    <xs:attribute name="DeviceManufacturerSN" type="String30Type" use="required"/>
    <xs:attribute name="PersonalizationDate" type="nvdateType" use="required"/>
    <xs:attribute name="CRC" type="Int32Type" use="required"/>
</xs:complexType>
<!--*************************************************************************-->
<!-- Define new type Int32Type -->
<!--*************************************************************************-->
<xs:simpleType name="Int32Type">
    <xs:restriction base="xs:string">
        <xs:pattern value="([A-Fa-f0-9]{8})"/>
    </xs:restriction>
</xs:simpleType>
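To see what the Int32Type pattern accepts, here is a rough Python equivalent using only the standard `re` module. It is a sketch, not part of the original code: the helper name `is_int32_hex` is made up for illustration. XSD patterns have to match the whole value, so `fullmatch` is the closest analogue.

```python
import re

# Same character class and length as the Int32Type pattern in the XSD.
HEX8 = re.compile(r"[A-Fa-f0-9]{8}")


def is_int32_hex(value):
    """Return True if value is exactly eight hexadecimal digits."""
    # fullmatch mirrors XSD semantics: the pattern must cover the entire value.
    return HEX8.fullmatch(value) is not None


if __name__ == "__main__":
    print(is_int32_hex("18CEBD29"))  # the NUIDCheckNumber from the record
    print(is_int32_hex("ZE2D6525"))  # the deliberately corrupted NUID
```

"ZE2D6525" is rejected because "Z" is not a hexadecimal digit, which is exactly the condition the test record is meant to trigger.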
The implementation is as follows:
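import lxml.etree as ET


def validateXMLByXSD(file_xml, file_xsd):
    """Verify that an XML file complies with an XSD schema.

    Arguments:
        1. file_xml: Input xml file
        2. file_xsd: xsd file to validate the xml against
    Return:
        True if the XML is valid, False otherwise
    """
    try:
        print("Validating:{0}".format(file_xml))
        print("xsd_file:{0}".format(file_xsd))
        xml_doc = ET.parse(file_xml)
        xsd_doc = ET.parse(file_xsd)
        xmlschema = ET.XMLSchema(xsd_doc)
        # assert_ raises AssertionError when the document violates the schema
        xmlschema.assert_(xml_doc)
        return True
    except ET.XMLSyntaxError as err:
        # One of the two files is not well-formed XML
        print("PARSING ERROR:{0}".format(err))
        return False
    except ET.XMLSchemaParseError as err:
        # The XSD is well-formed XML but not a valid schema
        print("XSD ERROR:{0}".format(err))
        return False
    except AssertionError as err:
        # The XML is well-formed but does not conform to the schema
        print("Incorrect XML schema: {0}".format(err))
        return False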
The test output below shows that the invalid attribute in the XML was indeed detected:
Incorrect XML schema: Element 'Record', attribute 'NUID': [facet 'pattern'] The value 'ZE2D6525' is not accepted by the pattern '([A-Fa-f0-9]{8})'., line 9
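The output above contains a single message. If a file has several bad attributes, one option is to call `validate()` and then read the schema's `error_log`, both of which lxml provides. The sketch below assumes a cut-down in-memory schema with only two of the attributes; the helper name `collect_errors` and the sample data are made up for illustration.

```python
from lxml import etree

# Reduced schema for illustration: only two attributes from the real XSD.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="Int32Type">
    <xs:restriction base="xs:string">
      <xs:pattern value="([A-Fa-f0-9]{8})"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="Record">
    <xs:complexType>
      <xs:attribute name="NUID" type="Int32Type" use="required"/>
      <xs:attribute name="CRC" type="Int32Type" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Hypothetical record with two bad values: a non-hex NUID and a 7-digit CRC.
BAD_XML = b'<Record NUID="ZE2D6525" CRC="6A4092E"/>'


def collect_errors(xml_bytes, xsd_bytes):
    """Return a list of (line, message) tuples, empty if the XML is valid."""
    schema = etree.XMLSchema(etree.fromstring(xsd_bytes))
    doc = etree.fromstring(xml_bytes)
    if schema.validate(doc):
        return []
    # error_log holds the entries recorded during the last validation run.
    return [(entry.line, entry.message) for entry in schema.error_log]


if __name__ == "__main__":
    for line, message in collect_errors(BAD_XML, XSD):
        print("line {0}: {1}".format(line, message))
```

Compared with `assert_`, this returns data instead of raising, so the caller can decide whether to print, log, or count the failures.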