Evaluating the reasoning abilities of AI language models in phonological and morphological problem solving

This study compared the reasoning abilities of five AI language models (ChatGPT-4, ChatGPT-4o, Claude 3.5 Sonnet, Gemini Advanced, and Llama 3.1) on phonological and morphological problems in linguistics. Novel questions on which the models had not been trained were created to determine whether they could solve problems through linguistic reasoning rather than memorization or dictionary-based knowledge. The problem set covered English noun stress assignment rules, English past tense formation rules, basic morphological analyses, and advanced morphological analyses involving allomorphy rules in a hypothetical language. The models scored between 63.75 and 79.25 out of 100, with Claude performing best, followed by GPT-4o, Llama, GPT-4, and Gemini. Claude offered a metacognitive explanation that answers derived through strict application of the given rules may differ from actual pronunciations. GPT-4o improved its performance by self-correcting errors that occurred during the morphological analyses, without additional prompting. However, the models often produced inconsistent analyses and blended the provided rules with the lexical knowledge on which they were pretrained. They scored particularly low on tasks involving morphemes with allomorphs, suggesting that capturing this feature of natural languages remains a considerable challenge for AI language models.