International Journal of Arts, Humanities & Social Science

ISSN 2693-2547 (Print), ISSN 2693-2555 (Online)
DOI: 10.56734/ijahss
Is AI Really Intelligent? Practical Insights from Real-World Use of Generative AI

Abstract
Generative AI systems based on Large Language Models (LLMs) have demonstrated remarkable capabilities in software engineering tasks, from code generation to natural language processing. However, a significant gap persists between curated demonstrations and production-grade deployment. This technical report presents a practitioner-driven analysis of LLM limitations encountered during sustained, real-world use across software development workflows. Drawing on multiple case studies—including natural-language-driven development (“vibe coding”), API integration, multi-file refactoring, and general-purpose question answering—we identify and taxonomize four critical failure modes: (1) the Complexity Cliff, where LLM performance degrades non-linearly as task interdependency grows; (2) Context Window Blindness, where finite attention spans cause silent contract violations across distributed codebases; (3) the Memory Illusion, where session discontinuity erases accumulated architectural knowledge; and (4) Confident Hallucination, where models generate plausible but fabricated outputs indistinguishable in tone from correct ones. We formalize the Verification Paradox—an inverse relationship between a user’s need for AI assistance and their capacity to validate its outputs—and propose a practical five-strategy framework for effective human–AI collaboration in software engineering contexts. We further introduce the concept of Contextual Reasoning Failure, evidenced by cases in which LLMs optimize for literal query patterns while ignoring situational logic obvious to any human observer. Our findings suggest that current LLMs, while powerful pattern-matching engines, lack the contextual reasoning, persistent memory, and epistemic self-awareness necessary for reliable autonomous operation, and that practitioner expertise remains the critical safeguard against AI-induced defects in production systems.