A Mappable CSV Reading Engine
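Many projects store their data in CSV files, so here I'd like to introduce a solution I designed myself. If anything looks unreasonable, please point it out.
Reading a CSV file is usually tedious: split the content into rows and columns on the separator (a comma by default), then read the data row by row and field by field in a fixed order. There is nothing wrong with this in itself, but as soon as the column order in the CSV changes, or a row is added or removed, the change ripples through all the reading code. And when there are many fields, writing that code is downright inhumane.
Our project started out with this primitive approach and ran into exactly the awkward situation described above. Then it occurred to me: if I could map the CSV table structure the way it is done for JSON or XML, wouldn't that be far more efficient?
So I came up with the following design:
1. Preparation
First define two attributes: one describes the table as a whole, the other maps individual fields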
    using System;

    /// <summary>
    /// CSV column mapping
    /// </summary>
    [AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
    public class CSVColumnAttribute : Attribute
    {
        /// <summary>
        /// Name of this property/field in csv file(default is property name)
        /// </summary>
        public string Key { get; set; }

        /// <summary>
        /// Column of this property/field in csv file(if column is assigned, key will be ignored)
        /// </summary>
        public int Column { get; set; }

        /// <summary>
        /// Default value(used if the field is NULL or parsing fails; default: -1 for number value, null for class, false for bool)
        /// </summary>
        public object DefaultValue { get; set; }

        /// <summary>
        /// Separator for parsing if it's an array('#' by default)
        /// </summary>
        public char ArraySeparator { get; set; }

        public CSVColumnAttribute()
        {
            Column = -1;
            ArraySeparator = '#';
        }

        public CSVColumnAttribute(string key)
        {
            Key = key;
            Column = -1;
            ArraySeparator = '#';
        }

        public CSVColumnAttribute(int column)
        {
            Column = column;
            ArraySeparator = '#';
        }
    }
    /// <summary>
    /// CSV Mapping class or struct(Avoid structs where possible: a struct is boxed and then unboxed during reflection)
    /// </summary>
    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct)]
    public class CSVMapperAttribute : Attribute
    {
        /// <summary>
        /// Path of the CSV file(without file extension). Base directory is Assets/Resources/
        /// </summary>
        public string Path { get; set; }

        /// <summary>
        /// Mapping key row(0 by default)
        /// </summary>
        public int KeyRow { get; set; }

        /// <summary>
        /// Description row(1 by default. Will be skipped in decoding. If no desc in file, assign -1)
        /// </summary>
        public int DescRow { get; set; }

        /// <summary>
        /// Separator for csv parsing(',' by default)
        /// </summary>
        public char Separator { get; set; }

        /// <summary>
        /// Starting index of data rows(0 by default)
        /// </summary>
        public int StartRow { get; set; }

        public CSVMapperAttribute()
        {
            KeyRow = 0;
            DescRow = 1;
            Separator = ',';
        }

        public CSVMapperAttribute(string name)
        {
            Path = name;
            KeyRow = 0;
            DescRow = 1;
            Separator = ',';
        }
    }
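The table attribute's properties are: the path of the CSV file (optional), the key row (useful for tables whose headers are not in English), the description row (optional), the separator (a comma ',' by default), and the start row (optional; rows before it are skipped during parsing).
The field-mapping attribute's properties are: the key, the column index (set either one; if neither is set, the key defaults to the member name), the default value (optional; returned when parsing the field fails), and the array separator (optional, '#' by default; used to split array values).
CSVMapperAttribute can be applied to a class or struct; CSVColumnAttribute can be applied to a property or field.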
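To make the precedence among these settings concrete, here is a minimal sketch of how a member could be resolved to a column index: an explicit column wins, then the key, then the member name. `ColumnResolver` and `Resolve` are hypothetical names used only for illustration; they mirror the lookup order described above rather than reproduce the engine's code.

```csharp
using System.Collections.Generic;

public static class ColumnResolver
{
    // Returns the index of the CSV column a member maps to, or -1 if nothing matches.
    // "keys" is the key (header) row of the table.
    public static int Resolve(int column, string key, string memberName, IList<string> keys)
    {
        // 1. An explicit, in-range column index takes priority; the key is ignored.
        if (column >= 0 && column < keys.Count)
        {
            return column;
        }

        // 2. Otherwise look the key up in the key row.
        if (!string.IsNullOrEmpty(key))
        {
            int byKey = keys.IndexOf(key);
            if (byKey >= 0)
            {
                return byKey;
            }
        }

        // 3. Finally fall back to the member name (IndexOf gives -1 when it is absent).
        return keys.IndexOf(memberName);
    }
}
```

For instance, with the key row `ID,Path,Ratio`, calling `Resolve(-1, null, "Path", keys)` falls through to the member name and yields 1.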
2. Reading and parsing
    using System;
    using System.Collections.Generic;
    using System.Globalization;
    using System.Linq;
    using System.Reflection;
    using UnityEngine;

    public class CSVReaderX
    {
        private List<List<string>> _records;

        /// <summary>
        /// Get column count
        /// </summary>
        public int ColumnCount { get; private set; }

        /// <summary>
        /// Get row count
        /// </summary>
        public int RowCount { get; private set; }

        /// <summary>
        /// Get separator
        /// </summary>
        public char Separator { get; private set; }

        private int _keyRow = -1;
        private int _descRow = -1;
        private int _startRow = -1;

        /// <summary>
        /// Decode the loaded CSV file to the target mapped type. Call the generic Load first.
        /// </summary>
        public IEnumerable<T> Decode<T>() where T : new()
        {
            //_descRow may legitimately be -1 (no description row), so it is not checked here
            if (_records == null || _keyRow < 0 || _keyRow >= _records.Count || _startRow < 0)
            {
                Debug.LogError(string.Format("Decoding Failed: {0}", typeof (T)));
                yield break;
            }

            //Decode each row
            for (int i = _startRow; i < _records.Count; i++)
            {
                if (i == _keyRow || i == _descRow)
                    continue;
                yield return DecodeRow<T>(_records[i], _records[_keyRow]);
            }
        }

        /// <summary>
        /// Decode single row
        /// </summary>
        private T DecodeRow<T>(List<string> fields, List<string> keys) where T : new()
        {
            //Hold the instance as object so that reflection writes to the same (boxed) copy for structs too
            object result = new T();
            IEnumerable<MemberInfo> members =
                typeof (T).GetMembers()
                    .Where(m => m.MemberType == MemberTypes.Property || m.MemberType == MemberTypes.Field)
                    .Where(m => Attribute.IsDefined(m, typeof (CSVColumnAttribute), false));

            foreach (MemberInfo member in members)
            {
                CSVColumnAttribute attribute =
                    member.GetCustomAttributes(typeof (CSVColumnAttribute), false).First() as CSVColumnAttribute;
                string field = GetRawValue(attribute, fields, keys, member.Name);
                //null means the mapping failed (already logged); return what has been decoded so far
                if (field == null)
                    return (T) result;
                SetValue(member, result, field, attribute.DefaultValue, attribute.ArraySeparator);
            }
            return (T) result;
        }

        /// <summary>
        /// Get raw value by CSVColumnAttribute or name. Returns null if no column can be mapped.
        /// </summary>
        private string GetRawValue(CSVColumnAttribute attribute, List<string> fields, List<string> keys, string name)
        {
            if (attribute.Column >= 0 && fields.Count > attribute.Column)
            {
                return fields[attribute.Column];
            }
            if (!string.IsNullOrEmpty(attribute.Key) && keys.Contains(attribute.Key))
            {
                return fields[keys.IndexOf(attribute.Key)];
            }
            if (keys.Contains(name))
            {
                return fields[keys.IndexOf(name)];
            }
            Debug.LogError(string.Format("Mapping Error! Column: {0}, Key: {1}, Name:{2}", attribute.Column,
                attribute.Key ?? "NULL", name));
            return null;
        }

        /// <summary>
        /// Parse and set raw value
        /// </summary>
        private void SetValue(MemberInfo member, object obj, string value, object defaultValue, char arraySeparator)
        {
            if (member.MemberType == MemberTypes.Property)
            {
                (member as PropertyInfo).SetValue(obj,
                    ParseRawValue(value, (member as PropertyInfo).PropertyType, defaultValue, arraySeparator),
                    null);
            }
            else
            {
                (member as FieldInfo).SetValue(obj,
                    ParseRawValue(value, (member as FieldInfo).FieldType, defaultValue, arraySeparator));
            }
        }

        /// <summary>
        /// Parse string value to specified type
        /// </summary>
        /// <param name="type">If type is collection, use array only(e.g. int[])</param>
        /// <param name="defaultValue">If type is collection, use element default(e.g. 0 for int[])</param>
        private object ParseRawValue(string field, Type type, object defaultValue, char arraySeparator)
        {
            try
            {
                if (type.IsArray)
                {
                    Type elementType = type.GetElementType();
                    IEnumerable<object> result =
                        field.Split(arraySeparator)
                            .Select(f => ParseRawValue(f, elementType, defaultValue, arraySeparator));
                    if (elementType == typeof (string))
                    {
                        return result.Cast<string>().ToArray();
                    }
                    if (elementType == typeof (int))
                    {
                        return result.Cast<int>().ToArray();
                    }
                    if (elementType == typeof (float))
                    {
                        return result.Cast<float>().ToArray();
                    }
                    if (elementType == typeof (double))
                    {
                        return result.Cast<double>().ToArray();
                    }
                    if (elementType == typeof (bool))
                    {
                        return result.Cast<bool>().ToArray();
                    }
                    return null;
                }
                if (type == typeof (string))
                {
                    return field;
                }
                //Numbers are parsed with the invariant culture so "1.5" means the same on every machine
                if (type == typeof (int))
                {
                    return Convert.ToInt32(field, CultureInfo.InvariantCulture);
                }
                if (type == typeof (float))
                {
                    return Convert.ToSingle(field, CultureInfo.InvariantCulture);
                }
                if (type == typeof (double))
                {
                    return Convert.ToDouble(field, CultureInfo.InvariantCulture);
                }
                if (type == typeof (bool))
                {
                    if (field == null)
                    {
                        return false;
                    }
                    field = field.Trim();
                    return field.Equals("true", StringComparison.OrdinalIgnoreCase) || field.Equals("1");
                }
            }
            catch (FormatException ex)
            {
                Debug.LogWarning(string.Format("{0}: {1} -> {2}", ex.Message, field, type));

                //In case default value is null but the property/field is not a reference type
                if (defaultValue == null)
                {
                    if (type == typeof (int) || type == typeof (float) || type == typeof (double))
                    {
                        //Box -1 as the target type so that unboxing to float/double does not fail later
                        defaultValue = Convert.ChangeType(-1, type);
                    }
                    else if (type == typeof (bool))
                    {
                        defaultValue = false;
                    }
                }
            }

            return defaultValue;
        }

        /// <summary>
        /// Load CSV into record list. If you need to decode records, use the generic Load followed by Decode instead.
        /// </summary>
        public bool Load(string path, char separator = ',')
        {
            //Dispose records
            ClearRecord();

            if (string.IsNullOrEmpty(path))
            {
                Debug.LogError(string.Format("CSV path not found: {0}", path));
                return false;
            }

            //Read text
            TextAsset asset = Resources.Load<TextAsset>(path);

            if (asset == null)
            {
                Debug.LogError(string.Format("CSV file not found: {0}", path));
                return false;
            }

            string content = asset.text;
            if (string.IsNullOrEmpty(content))
            {
                Debug.LogError(string.Format("CSV file content empty: {0}", path));
                return false;
            }

            Separator = separator;
            _records = new List<List<string>>();
            //Split on both '\r' and '\n' so that Windows and Unix line endings are handled alike
            string[] lines = content.Split(new[] {'\r', '\n'}, StringSplitOptions.RemoveEmptyEntries);
            foreach (string row in lines.Where(line => !string.IsNullOrEmpty(line.Trim())))
            {
                List<string> columns = row.Split(separator).Select(s => s.Trim()).ToList();
                //Check each row's column count. They must match
                if (ColumnCount != 0 && columns.Count != ColumnCount)
                {
                    Debug.LogError(
                        string.Format("CSV parsing error in {0} at row {1} : columns counts do not match! Separator: '{2}'", path,
                            _records.Count, separator));
                    return false;
                }
                ColumnCount = columns.Count;
                _records.Add(columns);
            }
            RowCount = _records.Count;

            if (_records == null || !_records.Any())
            {
                Debug.LogWarning(string.Format("CSV file parsing failed(empty records): {0}", path));
                return false;
            }

            return true;
        }

        public bool Load<T>()
        {
            ClearRecord();

            //Check mapping
            if (!Attribute.IsDefined(typeof (T), typeof (CSVMapperAttribute), false))
            {
                Debug.LogError(string.Format("CSV mapping not found in type: {0}", typeof (T)));
                return false;
            }

            CSVMapperAttribute mapper =
                Attribute.GetCustomAttribute(typeof (T), typeof (CSVMapperAttribute), false) as CSVMapperAttribute;
            _keyRow = mapper.KeyRow;
            _descRow = mapper.DescRow;
            _startRow = mapper.StartRow;

            bool result = Load(mapper.Path, mapper.Separator);
            if (result)
            {
                if (_keyRow < 0 || _keyRow >= _records.Count || _records[_keyRow].Any(string.IsNullOrEmpty))
                {
                    Debug.LogError(
                        string.Format("Encoding Error! No key column found. Make sure target file is in UTF-8 format. Path: {0}",
                            mapper.Path));
                    return false;
                }
            }
            return result;
        }

        /// <summary>
        /// Get string value at specified row and column. If record empty or position not found, NULL will be returned. Row/Column starts at 0
        /// </summary>
        public string this[int row, int column]
        {
            get
            {
                if (_records == null || _records.Count <= row || _records[row].Count <= column)
                {
                    return null;
                }
                return _records[row][column];
            }
        }

        /// <summary>
        /// Get a converted value at specified row and column. If record empty or position not found or conversion failed, defaultValue will be returned. Row/Column starts at 0
        /// </summary>
        /// <typeparam name="T">If T is collection, use array only(e.g. int[])</typeparam>
        /// <param name="defaultValue">If T is collection, use element default(e.g. 0 for int[])</param>
        public T Read<T>(int row, int column, object defaultValue, char arraySeparator = '#')
        {
            string field = this[row, column];
            if (field == null)
            {
                Debug.LogWarning("Field is null. Make sure csv is loaded and field has content.");
                return typeof (T).IsArray ? default(T) : (T) defaultValue;
            }

            return (T) ParseRawValue(field, typeof (T), defaultValue, arraySeparator);
        }

        /// <summary>
        /// Remove all records.
        /// </summary>
        public void ClearRecord()
        {
            _records = null;
            //Reset counts so that a file with a different column count can be loaded afterwards
            ColumnCount = 0;
            RowCount = 0;
        }
    }
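Stripped of Unity, logging and default values, the core of the mapping is a reflection loop over the attributed members. The following is a minimal sketch of that idea under simplifying assumptions: `ColAttribute`, `Item` and `MiniMapper` are hypothetical stand-ins (not the classes above), only fields are mapped, only by column index, and conversion is delegated to `Convert.ChangeType`.

```csharp
using System;
using System.Globalization;
using System.Reflection;

// Hypothetical, simplified counterpart of a column-mapping attribute.
[AttributeUsage(AttributeTargets.Field)]
public sealed class ColAttribute : Attribute
{
    public int Index { get; private set; }

    public ColAttribute(int index)
    {
        Index = index;
    }
}

// A struct is used on purpose to show that the boxed copy receives the values.
public struct Item
{
    [Col(0)] public int Id;
    [Col(1)] public string Name;
    [Col(2)] public float Ratio;
}

public static class MiniMapper
{
    // Builds a T from one row of raw string fields.
    public static T DecodeRow<T>(string[] fields) where T : new()
    {
        // Holding the instance as object boxes a struct exactly once.
        object boxed = new T();

        foreach (FieldInfo field in typeof(T).GetFields())
        {
            var col = (ColAttribute) Attribute.GetCustomAttribute(field, typeof(ColAttribute));
            if (col == null || col.Index >= fields.Length)
            {
                continue; // unmapped member or short row: leave the default value
            }

            // Convert the raw text to the member's type, independent of the machine's culture.
            object value = Convert.ChangeType(fields[col.Index], field.FieldType, CultureInfo.InvariantCulture);
            field.SetValue(boxed, value);
        }

        return (T) boxed;
    }
}
```

Calling `MiniMapper.DecodeRow<Item>(new[] { "10001", "Model/a", "1.5" })` fills all three fields. Because the instance is held as `object`, the same loop works for classes and structs alike; for a struct the values are written into the boxed copy, which is unboxed once at the end.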
Looks complicated? Let's walk through an example:
Add a table structure class
    [CSVMapper("Configs/Resource")]
    public class ResourceData : Data
    {
        [CSVColumn(0)] public int ID;
        [CSVColumn(1)] public string Path;
        [CSVColumn(2)] public float Ratio;
        [CSVColumn(3)] public string Desc;
    }
Add a method that reads a table based on the structure class
    /// <summary>
    /// Get table
    /// </summary>
    private IEnumerable<T> GetTable<T>() where T : Data, new()
    {
        CSVReaderX reader = new CSVReaderX();
        if (reader.Load<T>())
        {
            Debug.Log(string.Format("{0} Loaded", typeof (T)));
            return reader.Decode<T>();
        }

        //Return an empty sequence (needs System.Linq) rather than null so callers can chain LINQ calls safely
        return Enumerable.Empty<T>();
    }
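Note: having ResourceData inherit from Data, together with the generic constraint in GetTable, is only there to enforce consistent usage; it has no other purpose.
Data is defined as follows
    /// <summary>
    /// All table classes must inherit from this; it only serves as a generic constraint
    /// </summary>
    public abstract class Data
    {
    }
The content of Resource.csv is as follows: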
资源ID,资源路径,缩放比例,说明
int,string,float,string
10001,Model/a,1,
10002,Model/b,1,
10003,Model/c,1,
10004,Model/d,1,
10005,Model/e,1,
10006,Model/f,1,
10007,Model/g,1,
You can also map the columns by key directly:
    [CSVMapper("Configs/Resource")]
    public class ResourceData : Data
    {
        [CSVColumn("资源ID")] public int ID;
        [CSVColumn("资源路径")] public string Path;
        [CSVColumn("缩放比例")] public float Ratio;
        [CSVColumn("说明")] public string Desc;
    }
The second row (int,string,float,string) carries no real meaning for parsing, so it is treated as the Desc (description) row.
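Which rows end up being decoded follows from the three row settings: start at the start row and skip both the key row and the description row. A minimal sketch of that selection, with `RowFilter.DataRows` as a hypothetical helper name chosen for illustration:

```csharp
using System.Collections.Generic;

public static class RowFilter
{
    // Yields the indices of the rows that hold data, given the table layout.
    // Pass descRow = -1 when the file has no description row.
    public static IEnumerable<int> DataRows(int rowCount, int keyRow, int descRow, int startRow)
    {
        for (int i = startRow; i < rowCount; i++)
        {
            if (i == keyRow || i == descRow)
            {
                continue; // header-like rows are never decoded
            }
            yield return i;
        }
    }
}
```

For the nine-line Resource.csv above with the default settings (key row 0, description row 1, start row 0), this yields indices 2 through 8, that is, the seven data rows.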
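Load the table lazily and store it as a dictionary, and rows can then be looked up by key
    private Dictionary<int, ResourceData> _resourceDict;

    public Dictionary<int, ResourceData> ResourceDict
    {
        get { return _resourceDict ?? (_resourceDict = GetTable<ResourceData>().ToDictionary(k => k.ID)); }
    }

    //Look up by ID, e.g. the first data row of Resource.csv
    var data = ResourceDict[10001];
The above is the result of automatic loading once the table structure has been mapped.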
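Indexing a dictionary with a missing key throws KeyNotFoundException, so lazy loading is often paired with TryGetValue. A minimal sketch of that pattern, independent of the CSV engine: `Row`, `TableCache`, `Find` and the `loader` delegate are hypothetical names, and the loader merely stands in for something like GetTable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Row
{
    public int ID;
    public string Path;
}

public class TableCache
{
    private readonly Func<IEnumerable<Row>> _loader;
    private Dictionary<int, Row> _rows;

    // Counts how often the loader actually ran (for demonstration only).
    public int LoadCount { get; private set; }

    public TableCache(Func<IEnumerable<Row>> loader)
    {
        _loader = loader;
    }

    // The table is read on first access and cached afterwards.
    private Dictionary<int, Row> Rows
    {
        get
        {
            if (_rows == null)
            {
                LoadCount++;
                _rows = _loader().ToDictionary(r => r.ID);
            }
            return _rows;
        }
    }

    // Returns the row with the given ID, or null when the ID is unknown.
    public Row Find(int id)
    {
        Row row;
        return Rows.TryGetValue(id, out row) ? row : null;
    }
}
```

Repeated calls to `Find` hit the cached dictionary, and an unknown ID gives null instead of an exception.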
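I also provide an interface for manual parsing:
Manual Load
    public bool Load(string path, char separator = ',');
Manual Read
    public T Read<T>(int row, int column, object defaultValue, char arraySeparator = '#');
Or get the string value through the indexer and parse it yourself
    CSVReaderX reader = new CSVReaderX();

    reader.Load("Path");
    int val = reader.Read<int>(0, 0, 0);
    int[] vals = reader.Read<int[]>(0, 0, null);
    string raw = reader[0, 0];
Note that rows and columns are both counted from 0.
As for paths: this is a Unity3D project, so the mapped path is relative to Resources and has no file extension, and the Load method reads the asset with Resources.Load. For projects on other platforms, adjust this accordingly.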
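Outside Unity, the Resources.Load call could be replaced by ordinary file I/O. A minimal sketch, assuming a UTF-8 file on disk: `PlainCsvLoader`, `LoadFile` and `Parse` are hypothetical names, and like any parser that simply splits on the separator, it does not handle quoted fields that contain the separator or line breaks.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

public static class PlainCsvLoader
{
    // Reads a CSV file from disk (path includes the extension here) and splits it into rows and columns.
    public static List<List<string>> LoadFile(string path, char separator = ',')
    {
        return Parse(File.ReadAllText(path, Encoding.UTF8), separator);
    }

    // Splits raw CSV text into trimmed fields, skipping blank lines.
    public static List<List<string>> Parse(string content, char separator = ',')
    {
        return content
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries) // any line ending
            .Where(line => line.Trim().Length > 0)                              // drop whitespace-only lines
            .Select(line => line.Split(separator).Select(s => s.Trim()).ToList())
            .ToList();
    }
}
```

Keeping `Parse` separate from `LoadFile` means the text could equally come from a network response or an embedded resource.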
Collection fields must use a separator other than the comma ('#' by default), and must be declared as arrays
    [CSVMapper("Configs/Skill")]
    public class SkillData : Data
    {
        [CSVColumn(0)] public int ID;
        [CSVColumn(1)] public int Name;
        [CSVColumn(2)] public int[] SkillIDs;
    }
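Since the comma already separates columns, the elements of an array field need their own separator inside a single cell. A minimal sketch of splitting such a cell into an int[]: `ArrayField.ParseInts` is a hypothetical helper, and falling back to -1 for elements that cannot be parsed is an assumption modeled on the numeric default mentioned earlier.

```csharp
using System.Globalization;
using System.Linq;

public static class ArrayField
{
    // Splits one CSV cell such as "101#102#103" into integers.
    // Elements that are not valid integers become "fallback".
    public static int[] ParseInts(string field, char separator = '#', int fallback = -1)
    {
        return field
            .Split(separator)
            .Select(part =>
            {
                int value;
                bool ok = int.TryParse(part.Trim(), NumberStyles.Integer, CultureInfo.InvariantCulture, out value);
                return ok ? value : fallback;
            })
            .ToArray();
    }
}
```

A hypothetical Skill.csv row such as `30001,1,101#102#103` would then give SkillIDs the elements 101, 102 and 103.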
Questions and discussion are welcome.
The source code is available on my GitHub:
https://github.com/theoxuan/MTGeek/blob/master/Assets/Scripts/CSVReaderX.cs