ku, USA
51 days ago
(AIG Sonpo) IT Support Manager
Job Purpose

The IT Support Manager is a critical role in our application support team, responsible for the operational health and performance of our
applications. This role ensures high application availability and user satisfaction by swiftly resolving issues, performing root-cause analysis, and collaborating with our development team on fixes and improvements. You will bridge the gap between development and operations teams, fostering collaboration and shared ownership of reliability. Key responsibilities include defining and meeting Service Level Objectives (SLOs), managing error budgets, and conducting blameless postmortems for continuous improvement. Ultimately, you will strive to balance the speed of software development with system stability, ensuring a seamless user experience.

Job Responsibilities

+ Maintain continuous uptime and availability of critical business applications and services. This involves actively monitoring system performance, detecting potential issues, and implementing strategies to prevent downtime.
+ Respond to and resolve incidents and outages promptly. This includes troubleshooting problems, coordinating with other teams, and restoring service quickly.
+ Use AI and machine learning to analyze application logs and user behavior patterns, predict potential issues, and implement proactive measures to prevent disruptions and performance degradation.
+ Champion the adoption of AI technologies across the support organization. Provide training, create documentation, and establish best practices to upskill team members and foster a culture of AI innovation.
+ Automate repetitive, manual tasks (toil) to improve efficiency and reduce human error. This might involve scripting, developing tools, and improving infrastructure management processes.
+ Establish and maintain robust monitoring and alerting systems to gain real-time insight into system health and performance, allowing proactive detection of anomalies or potential issues.
+ After major incidents causing outages, conduct blameless postmortem reviews to analyze the root causes of failures, document learnings, and implement corrective measures to prevent recurrence.
+ Work with cross-functional teams, including product, engineering, and business stakeholders, to identify high-impact opportunities for AI integration. Clearly articulate how AI solutions will improve efficiency, enhance the customer experience, and deliver measurable business outcomes.
+ Establish clear, measurable targets for system performance and reliability, often based on Service Level Indicators (SLIs). These service level indicators and objectives guide development and operations priorities to maintain high levels of user satisfaction.
+ Stay informed on emerging AI technologies and industry trends. Evaluate and pilot new AI solutions to continuously enhance application support processes and capabilities.
+ Lead BAU projects and initiatives, representing the Japan application support team.

Key Relationships

Internal Interactions (Within the Organization)

+ All business areas
+ Global Production Support
+ Application Delivery
+ Information Security
+ Technology Risk
+ Infrastructure Services

External Interactions (Outside the Organization)

+ General Insurance Association of Japan
+ Product vendors and vendors providing maintenance or services
+ External organizations with which data is exchanged

Required Skills and Experiences

Educational Qualification

+ Bachelor’s degree in a related field

Specific Qualifications

+ Strong understanding of core concepts in AI and machine learning, including algorithms, model training, and deployment.
+ Proficiency in scripting languages (e.g., Python, Bash) and Infrastructure as Code (IaC) tools (e.g., Terraform, Ansible).
+ A strong, engineering-minded approach to solving problems, with a focus on system improvement and long-term strategic impact.
+ Ability to quickly diagnose and resolve system incidents, minimize downtime, and implement solutions to prevent recurrence. This includes developing and adhering to incident response plans and conducting post-incident reviews (PIRs).
+ Ability to use data from metrics, logs, and other sources to understand system behavior, analyze performance, identify trends, and make informed decisions that improve system reliability.
+ Experience designing, implementing, and managing AI-driven tools, such as intelligent chatbots and autonomous support agents, to automate routine technical support tasks and empower end users with self-service capabilities.
+ Excellent communication skills, in both Japanese and English, to articulate technical concepts, collaborate on projects, and foster a shared understanding of reliability goals.
+ Proactive approach to learning new technologies, methodologies, and tools in order to adapt to changing environments and continuously improve your skills and the systems you manage.
+ Ability to handle multiple tasks, prioritize them, and meet delivery deadlines.

Total Experience

+ 3+ years of relevant technology experience, demonstrating progressive responsibility and leadership in overseeing regional technology teams.
+ 5+ years of team lead or project management experience.
+ Demonstrated expertise in IT service management (ITSM) and observability tools, especially those that leverage AI for analytics.
+ Hands-on experience with large language models such as GPT and Claude and their associated frameworks (e.g., RAG) is a key requirement.
+ Practical experience running production systems, troubleshooting issues, and participating in on-call rotations is highly valued, as it builds crucial intuition for real-world system failures.

At AIG, we value in-person collaboration as a vital part of our culture, which is why we ask our team members to be primarily in the office. This approach helps us work together effectively and create a supportive, connected environment for our team and clients alike.

Enjoy benefits that take care of what matters

At AIG, our people are our greatest asset. We know how important it is to protect and invest in what’s most important to you. That is why we created our Total Rewards Program, a comprehensive benefits package that extends beyond time spent at work to offer benefits focused on your health, wellbeing and financial security, as well as your professional development, to bring peace of mind to you and your family.

Reimagining insurance to make a bigger difference to the world

American International Group, Inc. (AIG) is a global leader in commercial and personal insurance solutions; we are one of the world’s most far-reaching property casualty networks. It is an exciting time to join us: across our operations, we are thinking in new and innovative ways to deliver ever-better solutions to our customers. At AIG, you can go further to support individuals, businesses, and communities, helping them to manage risk, respond to times of uncertainty and discover new potential. We invest in our largest asset, our people, through continuous learning and development, in a culture that celebrates everyone for who they are and what they want to become.

Welcome to a culture of inclusion

We’re committed to creating a culture that truly respects and celebrates each other’s talents, backgrounds, cultures, opinions and goals. We foster a culture of inclusion and belonging through learning, cultural awareness activities and Employee Resource Groups (ERGs). With global chapters, ERGs are a cornerstone for our culture of inclusion. The talent of our people is one of AIG’s greatest assets, and we are honored that our drive for positive change has been recognized by numerous recent awards and accreditations.

AIG provides equal opportunity to all qualified individuals regardless of race, color, religion, age, gender, gender expression, national origin, veteran status, disability or any other legally protected categories.

AIG is committed to working with and providing reasonable accommodations to job applicants and employees with disabilities. If you believe you need a reasonable accommodation, please send an email to candidatecare@aig.com.

Functional Area: IT - Information Technology
AIG General Insurance Company, Ltd.