
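A Mappable CSV Reading Engine

Many projects store their data in CSV files, so here is a solution I designed myself. If anything about it looks unreasonable, please point it out.

Reading a CSV file is usually tedious: split the content into rows and columns by the separator (a comma by default), then read the values row by row and field by field in a fixed order. There is nothing wrong with that in itself, but as soon as the column order in the CSV changes, or a row is added or removed, the change ripples through all of the reading code. And when there are many fields, that code is really painful to write.

Our project started out with exactly this primitive approach and did run into the awkward situation described above. Then it occurred to me: if I could map the structure of a CSV table the way it is done for JSON or XML, wouldn't that be far more efficient?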
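To make the idea concrete, here is a minimal sketch of resolving columns by header name instead of by fixed position. It is not the engine described in this post, only an illustration; the class name HeaderLookupSketch and the two-column sample data are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderLookupSketch
{
    public static void Main()
    {
        // Hypothetical table; the first row names each column.
        string[] lines = { "Path,ID", "Model/a,10001", "Model/b,10002" };

        // Build a name -> column index map from the header row once.
        string[] header = lines[0].Split(',');
        var indexOf = new Dictionary<string, int>();
        for (int i = 0; i < header.Length; i++)
        {
            indexOf[header[i].Trim()] = i;
        }

        // Data rows are read by name, so swapping the two columns in the
        // file would not require any change to this loop.
        for (int r = 1; r < lines.Length; r++)
        {
            string[] fields = lines[r].Split(',');
            int id = int.Parse(fields[indexOf["ID"]]);
            string path = fields[indexOf["Path"]];
            Console.WriteLine("{0} -> {1}", id, path);
        }
    }
}
```

The attribute-based design in the rest of the post automates this lookup through reflection, so the mapping lives next to the data class rather than in hand-written reading code.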

That led to the following design:

 

1. Preparation

First define two attributes: one describes the table as a whole, the other maps individual fields.

using System;

/// <summary>
/// CSV column mapping
/// </summary>
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
public class CSVColumnAttribute : Attribute
{
    /// <summary>
    /// Name of this property/field in csv file(default is property name)
    /// </summary>
    public string Key { get; set; }

    /// <summary>
    /// Column of this property/field in csv file(if column is assigned, key will be ignored)
    /// </summary>
    public int Column { get; set; }

    /// <summary>
    /// Default value(if reading NULL or failed; default: -1 for number value, null for class, false for bool)
    /// </summary>
    public object DefaultValue { get; set; }

    /// <summary>
    /// Separator for parsing if it's an array('#' by default)
    /// </summary>
    public char ArraySeparator { get; set; }

    public CSVColumnAttribute()
    {
        Column = -1;
        ArraySeparator = '#';
    }

    public CSVColumnAttribute(string key)
    {
        Key = key;
        Column = -1;
        ArraySeparator = '#';
    }

    public CSVColumnAttribute(int column)
    {
        Column = column;
        ArraySeparator = '#';
    }
}

/// <summary>
/// CSV Mapping class or struct(Avoid struct where possible. A struct is boxed then unboxed in reflection)
/// </summary>
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct)]
public class CSVMapperAttribute : Attribute
{
    /// <summary>
    /// Path of the CSV file(without file extension). Base directory is Assets/Resources/
    /// </summary>
    public string Path { get; set; }

    /// <summary>
    /// Mapping key row(0 by default)
    /// </summary>
    public int KeyRow { get; set; }

    /// <summary>
    /// Description row(1 by default. Will be skipped in decoding. If no desc in file, assign -1)
    /// </summary>
    public int DescRow { get; set; }

    /// <summary>
    /// Separator for csv parsing(',' by default)
    /// </summary>
    public char Separator { get; set; }

    /// <summary>
    /// Starting index of data rows
    /// </summary>
    public int StartRow { get; set; }

    public CSVMapperAttribute()
    {
        KeyRow = 0;
        DescRow = 1;
        Separator = ',';
    }

    public CSVMapperAttribute(string name)
    {
        Path = name;
        KeyRow = 0;
        DescRow = 1;
        Separator = ',';
    }
}

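The table-level attribute has these properties: the path of the CSV file (optional), the key row (for tables whose headers are not in English), the description row (optional), the separator (',' by default), and the start row (optional; rows before it are skipped when parsing).

The field-mapping attribute has these properties: the key, the column index (set either the key or the column index; if neither is set, the key defaults to the member name), the default value (optional; returned when parsing the field fails), and the array separator (optional, '#' by default; used to split array values).

CSVMapperAttribute can be applied to classes or structs; CSVColumnAttribute can be applied to properties or fields.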
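As an illustration of how these options combine, the sketch below sets most of them through named attribute arguments. It assumes the two attribute classes listed above are in scope; the table path "Configs/Monster", the ';' separator, and every member name are hypothetical and exist only for this example.

```csharp
// Assumes CSVMapperAttribute and CSVColumnAttribute from the listings above.
// "Configs/Monster" and all member names are hypothetical.
[CSVMapper("Configs/Monster", KeyRow = 0, DescRow = 1, StartRow = 2, Separator = ';')]
public class MonsterData
{
    // Looked up by the key "Name" in the key row.
    [CSVColumn("Name")] public string Name;

    // Looked up by column index; a key is ignored once Column is set.
    [CSVColumn(2, DefaultValue = 100)] public int Hp;

    // Neither key nor column given: the member name "Speed" is used as the key.
    [CSVColumn(DefaultValue = 1f)] public float Speed;

    // Array field split on '|' instead of the default '#'.
    [CSVColumn("Drops", ArraySeparator = '|')] public int[] Drops;
}
```

Because the setters are ordinary public properties, any of them can be left out and the constructor defaults (key row 0, description row 1, ',' and '#') apply.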

2. Loading and parsing

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using UnityEngine;

public class CSVEngine
{
    private List<List<string>> _records;

    /// <summary>
    /// Get column count
    /// </summary>
    public int ColumnCount { get; private set; }

    /// <summary>
    /// Get row count
    /// </summary>
    public int RowCount { get; private set; }

    /// <summary>
    /// Get separator
    /// </summary>
    public char Separator { get; private set; }

    private int _keyRow = -1;
    private int _descRow = -1;
    private int _startRow = -1;

    /// <summary>
    /// Decode the loaded records to the target mapped type. Call the generic Load first.
    /// </summary>
    public IEnumerable<T> Decode<T>() where T : new()
    {
        //_descRow is not checked here: -1 is a valid value meaning "no description row"
        if (_records == null || _keyRow < 0 || _startRow < 0)
        {
            Debug.LogError(string.Format("Decoding Failed: {0}", typeof (T)));
            yield break;
        }

        //Decode each row
        for (int i = _startRow; i < _records.Count; i++)
        {
            if (i == _keyRow || i == _descRow)
                continue;
            yield return DecodeRow<T>(_records[i], _records[_keyRow]);
        }
    }

    /// <summary>
    /// Decode single row
    /// </summary>
    private T DecodeRow<T>(List<string> fields, List<string> keys) where T : new()
    {
        T result = new T();
        //Only members carrying a CSVColumnAttribute are mapped
        IEnumerable<MemberInfo> members =
            typeof (T).GetMembers()
                .Where(m => m.MemberType == MemberTypes.Property || m.MemberType == MemberTypes.Field)
                .Where(m => Attribute.IsDefined(m, typeof (CSVColumnAttribute), false));

        if (typeof (T).IsValueType)
        {
            //A struct must be boxed once, otherwise reflection would modify a copy
            object boxed = result;
            foreach (MemberInfo member in members)
            {
                CSVColumnAttribute attribute =
                    member.GetCustomAttributes(typeof (CSVColumnAttribute), false).First() as CSVColumnAttribute;
                string field = GetRawValue(attribute, fields, keys, member.Name);
                //GetRawValue returns the very same name reference when the mapping failed
                if (ReferenceEquals(field, member.Name))
                    return result;
                SetValue(member, boxed, field, attribute.DefaultValue, attribute.ArraySeparator);
            }
            return (T) boxed;
        }

        foreach (MemberInfo member in members)
        {
            CSVColumnAttribute attribute =
                member.GetCustomAttributes(typeof (CSVColumnAttribute), false).First() as CSVColumnAttribute;
            string field = GetRawValue(attribute, fields, keys, member.Name);
            if (ReferenceEquals(field, member.Name))
                return result;
            SetValue(member, result, field, attribute.DefaultValue, attribute.ArraySeparator);
        }
        return result;
    }

    /// <summary>
    /// Get raw value by CSVColumnAttribute or name.
    /// Lookup order: column index, then attribute key, then member name.
    /// </summary>
    private string GetRawValue(CSVColumnAttribute attribute, List<string> fields, List<string> keys, string name)
    {
        if (attribute.Column >= 0 && fields.Count > attribute.Column)
        {
            return fields[attribute.Column];
        }
        if (!string.IsNullOrEmpty(attribute.Key) && keys.Contains(attribute.Key))
        {
            return fields[keys.IndexOf(attribute.Key)];
        }
        if (keys.Contains(name))
        {
            return fields[keys.IndexOf(name)];
        }
        Debug.LogError(string.Format("Mapping Error! Column: {0}, Key: {1}, Name:{2}", attribute.Column,
            attribute.Key ?? "NULL", name));
        return name;
    }

    /// <summary>
    /// Parse and set raw value
    /// </summary>
    private void SetValue(MemberInfo member, object obj, string value, object defaultValue, char arraySeparator)
    {
        if (member.MemberType == MemberTypes.Property)
        {
            (member as PropertyInfo).SetValue(obj,
                ParseRawValue(value, (member as PropertyInfo).PropertyType, defaultValue, arraySeparator),
                null);
        }
        else
        {
            (member as FieldInfo).SetValue(obj,
                ParseRawValue(value, (member as FieldInfo).FieldType, defaultValue, arraySeparator));
        }
    }

    /// <summary>
    /// Parse string value to specified type
    /// </summary>
    /// <param name="field">Raw string value</param>
    /// <param name="type">If type is collection, use array only(e.g. int[])</param>
    /// <param name="defaultValue">If type is collection, use element default(e.g. 0 for int[])</param>
    /// <param name="arraySeparator">Separator between array elements</param>
    private object ParseRawValue(string field, Type type, object defaultValue, char arraySeparator)
    {
        try
        {
            if (type.IsArray)
            {
                //Parse every element recursively, then cast to the concrete element type
                IEnumerable<object> result =
                    field.Split(arraySeparator)
                        .Select(f => ParseRawValue(f, type.GetElementType(), defaultValue, arraySeparator));
                if (type.GetElementType() == typeof (string))
                {
                    return result.Cast<string>().ToArray();
                }
                if (type.GetElementType() == typeof (int))
                {
                    return result.Cast<int>().ToArray();
                }
                if (type.GetElementType() == typeof (float))
                {
                    return result.Cast<float>().ToArray();
                }
                if (type.GetElementType() == typeof (double))
                {
                    return result.Cast<double>().ToArray();
                }
                if (type.GetElementType() == typeof (bool))
                {
                    return result.Cast<bool>().ToArray();
                }
                return null;
            }
            if (type == typeof (string))
            {
                return field;
            }
            if (type == typeof (int))
            {
                return Convert.ToInt32(field);
            }
            if (type == typeof (float))
            {
                return Convert.ToSingle(field);
            }
            if (type == typeof (double))
            {
                return Convert.ToDouble(field);
            }
            if (type == typeof (bool))
            {
                if (field == null)
                {
                    return false;
                }
                field = field.Trim();
                return field.Equals("true", StringComparison.CurrentCultureIgnoreCase) || field.Equals("1");
            }
        }
        catch (FormatException ex)
        {
            Debug.LogWarning(string.Format("{0}: {1} -> {2}", ex.Message, field, type));

            //In case default value is null but the property/field is not a reference type
            if (defaultValue == null)
            {
                if (type == typeof (int) || type == typeof (float) || type == typeof (double))
                {
                    //Convert so that the boxed -1 has the target numeric type
                    defaultValue = Convert.ChangeType(-1, type);
                }
                else if (type == typeof (bool))
                {
                    defaultValue = false;
                }
            }
        }

        return defaultValue;
    }

    /// <summary>
    /// Load CSV into record list. If you need to decode records, use the generic Load followed by Decode instead.
    /// </summary>
    /// <param name="path">Path under Resources, without file extension</param>
    /// <param name="separator">Column separator</param>
    public bool Load(string path, char separator = ',')
    {
        //Dispose records
        ClearRecord();

        if (string.IsNullOrEmpty(path))
        {
            Debug.LogError(string.Format("CSV path not found: {0}", path));
            return false;
        }

        //Read text
        TextAsset asset = Resources.Load<TextAsset>(path);

        if (asset == null)
        {
            Debug.LogError(string.Format("CSV file not found: {0}", path));
            return false;
        }

        string content = asset.text;
        if (string.IsNullOrEmpty(content))
        {
            Debug.LogError(string.Format("CSV file content empty: {0}", path));
            return false;
        }

        Separator = separator;
        _records = new List<List<string>>();
        //Split on both '\r' and '\n' so that CRLF and LF files are handled alike
        foreach (string row in content.Split(new[] {'\r', '\n'}, StringSplitOptions.RemoveEmptyEntries)
            .Where(line => !string.IsNullOrEmpty(line.Trim())))
        {
            List<string> columns = row.Split(separator).Select(s => s.Trim()).ToList();
            //Check each row's column count. They must match
            if (ColumnCount != 0 && columns.Count != ColumnCount)
            {
                Debug.LogError(
                    string.Format("CSV parsing error in {0} at row {1} : columns counts do not match! Separator: '{2}'", path,
                        _records.Count, separator));
                return false;
            }
            ColumnCount = columns.Count;
            _records.Add(columns);
        }
        RowCount = _records.Count;

        if (_records == null || !_records.Any())
        {
            Debug.LogWarning(string.Format("CSV file parsing failed(empty records): {0}", path));
            return false;
        }

        return true;
    }

    /// <summary>
    /// Load the CSV file described by the CSVMapperAttribute on T.
    /// </summary>
    public bool Load<T>()
    {
        ClearRecord();

        //Check mapping
        if (!Attribute.IsDefined(typeof (T), typeof (CSVMapperAttribute), false))
        {
            Debug.LogError(string.Format("CSV mapping not found in type: {0}", typeof (T)));
            return false;
        }

        CSVMapperAttribute mapper =
            Attribute.GetCustomAttribute(typeof (T), typeof (CSVMapperAttribute), false) as CSVMapperAttribute;
        _keyRow = mapper.KeyRow;
        _descRow = mapper.DescRow;
        _startRow = mapper.StartRow;

        bool result = Load(mapper.Path, mapper.Separator);
        if (result)
        {
            if (_records[_keyRow].Any(string.IsNullOrEmpty))
            {
                Debug.LogError(
                    string.Format("Encoding Error! No key column found. Make sure target file is in UTF-8 format. Path: {0}",
                        mapper.Path));
                return false;
            }
        }
        return result;
    }

    /// <summary>
    /// Get string value at specified row and column. If record empty or position not found, NULL will be returned. Row/Column starts at 0
    /// </summary>
    public string this[int row, int column]
    {
        get
        {
            if (_records == null || _records.Count <= row || _records[row].Count <= column)
            {
                return null;
            }
            return _records[row][column];
        }
    }

    /// <summary>
    /// Get a converted value at specified row and column. If record empty or position not found or conversion failed, defaultValue will be returned. Row/Column starts at 0
    /// </summary>
    /// <typeparam name="T">If T is collection, use array only(e.g. int[])</typeparam>
    /// <param name="row">Row index</param>
    /// <param name="column">Column index</param>
    /// <param name="defaultValue">If T is collection, use element default(e.g. 0 for int[])</param>
    /// <param name="arraySeparator">Separator between array elements</param>
    public T Read<T>(int row, int column, object defaultValue, char arraySeparator = '#')
    {
        string field = this[row, column];
        if (field == null)
        {
            Debug.LogWarning("Field is null. Make sure csv is loaded and field has content.");
            return typeof (T).IsArray ? default(T) : (T) defaultValue;
        }

        return (T) ParseRawValue(field, typeof (T), defaultValue, arraySeparator);
    }

    /// <summary>
    /// Remove all records.
    /// </summary>
    public void ClearRecord()
    {
        _records = null;
        //Reset counts so that a second file with a different width can be loaded
        ColumnCount = 0;
        RowCount = 0;
    }
}

 

Does it look complicated? Let's walk through an example:

Add a class that describes the table structure

[CSVMapper("Configs/Resource")]
public class ResourceData : Data
{
    [CSVColumn(0)] public int ID;
    [CSVColumn(1)] public string Path;
    [CSVColumn(2)] public float Ratio;
    [CSVColumn(3)] public string Desc;
}

Add a method that reads the table according to that class

/// <summary>
/// Get table
/// </summary>
private IEnumerable<T> GetTable<T>() where T : Data, new()
{
    CSVEngine reader = new CSVEngine();
    if (reader.Load<T>())
    {
        Debug.Log(string.Format("{0} Loaded", typeof (T)));
        return reader.Decode<T>();
    }

    return null;
}

Note: ResourceData inherits from Data, and GetTable has a generic constraint on it, purely to standardize usage. It has no other purpose.

Data is defined as follows

/// <summary>
/// All table classes must inherit this for constraint
/// </summary>
public abstract class Data
{
}

The content of Resource.csv is:

资源ID,资源路径,缩放比例,说明
int,string,float,string
10001,Model/a,1,
10002,Model/b,1,
10003,Model/c,1,
10004,Model/d,1,
10005,Model/e,1,
10006,Model/f,1,
10007,Model/g,1,

You can also map by key instead:

[CSVMapper("Configs/Resource")]
public class ResourceData : Data
{
    [CSVColumn("资源ID")] public int ID;
    [CSVColumn("资源路径")] public string Path;
    [CSVColumn("缩放比例")] public float Ratio;
    [CSVColumn("说明")] public string Desc;
}

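The second row (int,string,float,string) carries no real meaning, so it is treated as the Desc (description) row.

Load the table lazily and store it as a dictionary, and you can look rows up by key

private Dictionary<int, ResourceData> _resourceDict;

public Dictionary<int, ResourceData> ResourceDict
{
    get
    {
        // Loaded on first access, then cached
        return _resourceDict ?? (_resourceDict = GetTable<ResourceData>().ToDictionary(k => k.ID));
    }
}

// The dictionary key is the ID column, e.g. the first data row of Resource.csv
var data = ResourceDict[10001];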

 

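The above is what you get from automatic loading once the table structure has been mapped.

I also provide an interface for manual parsing:

Manual Load

public bool Load(string path, char separator = ',');

Manual Read

public T Read<T>(int row, int column, object defaultValue, char arraySeparator = '#');

Or get the string value through the indexer and parse it yourself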

CSVEngine reader = new CSVEngine();

reader.Load("Path");
int val = reader.Read<int>(0, 0, 0);
int[] vals = reader.Read<int[]>(0, 0, null);
string raw = reader[0, 0];

 

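Note that rows and columns are both counted from 0.

About paths: this is a Unity3D project, so the mapped path is relative to Resources and has no file extension, and Load reads the asset with Resources.Load. For projects on other platforms, adjust this accordingly.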
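For a project outside Unity, the adjustment could look roughly like the sketch below, which reads the file from disk with File.ReadAllText in place of Resources.Load. The class PlainCsvLoader and its two methods are hypothetical names and not part of the engine above; like the engine, the sketch does not handle quoted fields or escaped separators.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class PlainCsvLoader
{
    // Hypothetical replacement for the Resources.Load step:
    // here filePath is a normal file system path, including the extension.
    public static List<List<string>> LoadRecords(string filePath, char separator = ',')
    {
        string content = File.ReadAllText(filePath);
        return ParseRecords(content, separator);
    }

    // Splits raw text into rows and trimmed fields, skipping blank lines.
    public static List<List<string>> ParseRecords(string content, char separator = ',')
    {
        return content
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(line => line.Trim().Length > 0)
            .Select(line => line.Split(separator).Select(f => f.Trim()).ToList())
            .ToList();
    }
}
```

Keeping the parsing step separate from the file access means only LoadRecords has to change per platform, while the row and field splitting stays the same.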

Collection fields must use a separator other than the comma ('#' by default) and must be array types.

[CSVMapper("Configs/Skill")]
public class SkillData : Data
{
    [CSVColumn(0)] public int ID;
    [CSVColumn(1)] public int Name;
    [CSVColumn(2)] public int[] SkillIDs;
}
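The reason for the separate array separator can be seen in a small standalone sketch: the row is first split on ',' into columns, and only then is the array column split on '#'. The class ArrayFieldSketch, the helper ParseIntArray, and the sample row are made up for illustration and are not taken from the engine.

```csharp
using System;
using System.Linq;

public static class ArrayFieldSketch
{
    // Splits one CSV field into integers using the array separator.
    public static int[] ParseIntArray(string field, char arraySeparator = '#')
    {
        return field.Split(arraySeparator)
                    .Select(part => int.Parse(part.Trim()))
                    .ToArray();
    }

    public static void Main()
    {
        // Hypothetical row: it still has exactly three CSV columns,
        // because '#' is not the column separator.
        string[] columns = "1001,3,201#202#203".Split(',');
        int[] skillIds = ParseIntArray(columns[2]);
        Console.WriteLine(string.Join(", ", skillIds.Select(i => i.ToString()).ToArray()));
    }
}
```

If the array used ',' as well, the row would be split into five columns and the column count check in Load would reject the file.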

 

If you have questions, feel free to discuss.

 

The source code is on my GitHub:

https://github.com/theoxuan/MTGeek/blob/master/Assets/Scripts/CSVReaderX.cs

 
