Most studies that compare the quality of Neural Machine Translation (NMT) with that of Statistical Machine Translation (SMT) rely on automatic evaluation methods, mainly the Bilingual Evaluation Understudy (BLEU), without performing any kind of human assessment. While BLEU is a good indicator of the overall performance of MT systems, it offers no detailed linguistic insight into the types of errors generated by those MT models. Such insights are crucial for researchers to identify areas for improvement and for language service providers to understand whether upgrading to NMT yields better results. This paper breaks free from BLEU by conducting an error analysis that compares the performance of Google's SMT and NMT engines for English-into-Arabic translation. The corpus consists of six WikiHow articles. The analysis is guided by the DQF-MQM Harmonized Error Typology, which classifies translation errors into eight major categories: accuracy, fluency, terminology, style, design, locale convention, verity, and other (for any remaining issues). Such a fine-grained classification of translation errors enables the researcher to explore the error types generated by each MT model, the error types eliminated by NMT, and the new error types introduced by NMT. The paper focuses on the English-Arabic language pair because it is one of the least studied pairs in the comparative literature on SMT and NMT. The results show that NMT generates fewer grammatical errors and mistranslations than SMT, and its output is more fluent and robust. However, SMT is more consistent in translating proper nouns and out-of-vocabulary words.
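
For illustration only, the sketch below shows how a corpus-level BLEU score is typically obtained with the sacrebleu library; the Arabic hypothesis and reference sentences are hypothetical and are not taken from this study. It makes the abstract's point concrete: the metric returns a single aggregate number and carries no information about error categories such as accuracy, fluency, or terminology.

    # Minimal sketch, assuming sacrebleu is installed; data below is invented, not the paper's corpus.
    import sacrebleu

    # Hypothetical MT outputs (system hypotheses) for an English-to-Arabic task.
    hypotheses = ["ضع الخليط في وعاء كبير .", "اترك العجين يرتاح لمدة ساعة ."]

    # One stream of reference translations, aligned sentence by sentence with the hypotheses.
    references = [["ضع المزيج في وعاء كبير .", "اترك العجينة ترتاح لمدة ساعة ."]]

    bleu = sacrebleu.corpus_bleu(hypotheses, references)
    print(bleu.score)  # a single corpus-level score; no breakdown by error type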