The Challenge
Data center leaders face three converging pressures: AI/ML infrastructure demands requiring 3-5x power density, global capacity planning across distributed facilities, and operational excellence under relentless uptime requirements.
Most organizations struggle to find leaders who can operate at both tactical and strategic levels—who understand the technical realities of data center operations while driving business transformation.
My Approach
I’m the person organizations send in when they don’t know what to do—when the facility design doesn’t exist yet, the process needs building, or nobody wants to own the problem. I solve what others won’t or can’t.
I’ve spent 20+ years solving these problems at scale: from my early career at Verizon Business (extensive international travel across the US, EMEA, and APAC) to leading hyperscale operations at Salesforce (8.5MW, 40K sq ft, 99.99% uptime), managing multi-million dollar facility builds, and coordinating global infrastructure deployments across three continents.
I work effectively at every organizational level: I brief executives, coordinate with facilities teams, partner with engineering, and hold vendors accountable. I meet people where they are and show them how to get where they want to go.
Results I Deliver
- International build leadership across US, EMEA, and APAC regions with local team training
- General contractor coordination for multi-million dollar facility programs
- M&A infrastructure integration (Commerce Cloud, Pardot) with zero business disruption
- Team development of 5 direct reports and training of 50-60 engineers
- Asset management of 20,000+ equipment pieces across multiple facilities
- Built Dallas data center preventing $2M/month cost overruns
- Led decommissioning of 22 facilities achieving $200K recurring savings
- Maintained 100% uptime across mission-critical infrastructure (2025)
How I Work
I’m currently leading infrastructure operations at Salesforce while taking on select strategic advisory and interim leadership engagements. This dual-track approach lets me stay current with cutting-edge infrastructure challenges while helping organizations solve their most complex problems.
Available For:
- Full-time VP/SVP roles: Exceptional opportunities leading infrastructure organizations at scale
- Strategic advisory: 2-4 week assessments, ongoing retainer relationships
- Interim leadership: 3-6 month build programs, turnaround situations, transition leadership
- Executive coaching: Career acceleration for data center leaders moving to VP/SVP levels
Not Available For:
- Short-term tactical projects
- Roles without strategic impact
- Opportunities below VP/Director level
Data Center Build & Program Leadership
International Deployment Experience
Led data center infrastructure projects across three continents (US, EMEA, APAC) with a focus on:
- Greenfield data center design and deployment
- Brownfield facility expansions and capacity additions
- Physical infrastructure: Power distribution systems, space planning, cooling systems, structured cabling (fiber/copper)
- General contractor oversight for multi-million dollar build programs
- International team training and operational handoff
Technical Leadership Scope:
- Power Systems: Distribution planning, UPS management, capacity forecasting
- Space Planning: Multi-data hall layouts, rack optimization, equipment placement
- Cooling Systems: CRAC/CRAH coordination, cooling capacity management
- Structured Cabling: Fiber backbone and copper distribution infrastructure
- Equipment Installation: Server, network, and storage infrastructure deployment
Program Management:
- Led 4-6 remote engineering teams across international regions
- Coordinated builds across US data centers and EMEA/APAC facilities
- Managed general contractors and multi-vendor ecosystems
- Delivered operational training for newly deployed infrastructure
Technical Expertise & Core Competencies
Infrastructure & Operations
Data Center Design & Construction: Greenfield site design, brownfield expansion planning, physical infrastructure coordination (power systems, space planning, cooling systems, structured cabling fiber/copper), general contractor oversight, international team coordination → Prevents multi-million dollar overbuilds and capacity shortfalls through data-driven facility planning
Power & Cooling: 8.5MW facility operations, N+1/2N redundancy design, liquid cooling for AI workloads, power density optimization (traditional 5-8kW/rack to AI 40-60kW/rack) → Enables AI/ML infrastructure transformation while controlling energy costs and cooling requirements
Network Architecture: ToR/EoR switching designs, 100G/400G datacenter fabrics, BGP routing, software-defined networking (SDN), network automation → Delivers high-performance connectivity supporting millions of users with 99.99% uptime
Compute & Storage: HPC cluster design, GPU infrastructure for AI/ML workloads, high-availability storage systems (SAN/NAS), hyperconverged infrastructure → Supports business-critical applications and AI workloads with zero data loss and rapid recovery
Physical Infrastructure: Raised floor/slab design, hot aisle/cold aisle containment, cable management, structured cabling (fiber/copper), DCIM implementation → Optimizes operational efficiency and reduces cooling costs through proper airflow management
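To make the power density figures above concrete, here is a minimal sketch of how rack count shrinks as per-rack density rises within a fixed power budget. It assumes, purely for illustration, that the full 8.5MW is usable IT load (real facilities lose part of that to cooling and distribution overhead); the function name `racks_supported` is hypothetical and does not describe any actual planning tool.

```python
import math


def racks_supported(it_load_kw: float, kw_per_rack: float) -> int:
    """Whole racks that fit within an IT power budget at a given density."""
    if it_load_kw < 0 or kw_per_rack <= 0:
        raise ValueError("load must be >= 0 and density must be > 0")
    return math.floor(it_load_kw / kw_per_rack)


# Assumption: treat the 8.5MW facility figure as usable IT load.
FACILITY_KW = 8_500

# Density ranges quoted in the text: traditional 5-8kW/rack, AI 40-60kW/rack.
for label, density_kw in [
    ("traditional (low)", 5),
    ("traditional (high)", 8),
    ("AI (low)", 40),
    ("AI (high)", 60),
]:
    count = racks_supported(FACILITY_KW, density_kw)
    print(f"{label:>18}: {density_kw:>2} kW/rack -> {count} racks")
```

Under that assumption the same power envelope serves well over a thousand traditional racks but only a couple hundred AI racks, which is why power and cooling, rather than floor space, tend to become the binding constraint.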
Strategic Planning & Program Management
- Capacity Planning: 3-5 year infrastructure forecasting, demand modeling, financial optimization (TCO analysis), site selection and evaluation
- Change Management: ITIL/ITSM frameworks, incident/problem management, release management, continuous improvement methodologies
- Project Management: Multi-million dollar facility builds, decommissioning programs, vendor management, budget oversight ($2M+ monthly responsibility)
- Risk Management: Business continuity planning, disaster recovery, compliance frameworks (SOC 2, ISO 27001 principles), audit coordination
AI/ML Infrastructure (Emerging Specialization)
- AI Infrastructure Planning: GPU cluster architecture, high-bandwidth networking for distributed training, inference optimization, model deployment infrastructure
- Power/Cooling for AI: Liquid cooling systems, rear-door heat exchangers, in-row cooling, power distribution for high-density racks
- HPC Environments: Parallel computing architectures, InfiniBand/RoCE fabrics, job scheduling, performance optimization
- Operational Intelligence: AI-assisted monitoring, predictive analytics, automated remediation, LLM-based operational tools
Tools & Platforms
Monitoring & Observability: Splunk, ServiceNow, PagerDuty, custom dashboards, alerting frameworks → Predictive operations reducing MTTR by 30% through AI-assisted incident detection
Automation & Scripting: Python, Bash, Ansible, infrastructure-as-code principles, CI/CD for infrastructure → Enables consistent, error-free deployments and reduces manual operational overhead
Documentation & Collaboration: Confluence, Jira, Slack, Obsidian (knowledge management), technical writing → Facilitates knowledge management across distributed teams and accelerates onboarding
Asset Management: DCIM platforms, Asset Force, inventory tracking systems, hardware lifecycle management → Prevents $500K+ in lost assets and enables strategic hardware refresh planning
Vendor Ecosystems: Cisco, Dell, HPE, VMware, Arista, NVIDIA (AI infrastructure), colocation providers (Equinix, QTS, Digital Realty) → Multi-vendor expertise prevents vendor lock-in and optimizes procurement costs
Industry Certifications & Standards
- DCIS (Data Center Infrastructure Specialist) - International Data Center Authority
- DCOS (Data Center Operations Specialist) - International Data Center Authority
- DCOM (Data Center Operations Manager) - International Data Center Authority
- ITIL Principles - IT Service Management frameworks
- Data Center Tier Standards - Uptime Institute Tier I-IV designs
- ASHRAE Thermal Guidelines - TC 9.9 standards for data center environments
Case Studies
Case Study 1: Multi-Million Dollar Data Center Exit Strategy
Challenge: Major tech company needed to vacate data center hosting 158 applications before lease expiration, requiring complex coordination across multiple teams while maintaining zero downtime.
Technical Context:
- 40,000 sq ft facility with mixed compute, storage, and networking infrastructure
- Legacy systems requiring careful migration planning
- Network re-architecture needed for new site connectivity
- Hardware refresh opportunity aligned with exit timeline
Approach:
- Led cross-functional team of 40+ stakeholders across 6 teams (Network, Storage, Compute, Applications, Security, Facilities)
- Developed comprehensive migration strategy with parallel workstreams using RACI framework
- Coordinated equipment relocation worth $1M/month in operational value (preventing replacement costs)
- Established clear accountability and communication frameworks with weekly executive briefings
- Implemented risk mitigation strategies including rollback procedures and pilot migrations
Result:
- Completed on schedule (12-month program)
- Zero downtime during entire migration (maintained SLA commitments)
- $2M/month cost avoidance achieved through facility exit and consolidated operations
- Process framework adopted for future facility exits across the organization
- Team capability building: 6 team members gained program management experience
Key Insight: Large-scale infrastructure transformations succeed through stakeholder alignment and clear ownership—not just technical execution. The RACI framework and weekly executive visibility were critical to maintaining momentum and resolving blockers quickly.
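The "clear ownership" point can be checked mechanically. The sketch below is one possible way to validate a RACI matrix so that every task has exactly one Accountable owner. The team names come from the case study; the task names, the `raci_gaps` function, and the matrix contents are hypothetical examples, not the actual program artifacts.

```python
def raci_gaps(matrix: dict[str, dict[str, str]]) -> dict[str, str]:
    """Return the tasks that do not have exactly one Accountable ("A") team.

    `matrix` maps task name -> {team name -> RACI letter}.
    """
    problems = {}
    for task, roles in matrix.items():
        accountable = [team for team, role in roles.items() if role == "A"]
        if len(accountable) != 1:
            problems[task] = f"expected 1 Accountable, found {len(accountable)}"
    return problems


# Hypothetical tasks; teams are the six named in the case study.
example = {
    "Cut over WAN links": {"Network": "A", "Security": "C", "Facilities": "I"},
    "Migrate storage arrays": {"Storage": "R", "Compute": "R", "Applications": "C"},
    "Pilot application move": {"Applications": "A", "Compute": "A", "Network": "R"},
}

for task, issue in raci_gaps(example).items():
    print(f"{task}: {issue}")
```

Running a check like this before each executive briefing would surface unowned or doubly owned tasks early, which is the failure mode a RACI framework is meant to prevent.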
Case Study 2: AI-Driven Operations Transformation
Challenge: Operations team needed to scale efficiency without headcount increases while handling growing infrastructure complexity and AI/ML workload demands.
Technical Context:
- Operations team managing 8.5MW facility with 2,000+ servers, 500+ network devices, 100+ storage arrays
- Manual processes for incident triage, capacity planning, and change management
- Growing demand for AI/ML workloads requiring new operational approaches
- Need to demonstrate ROI of AI investments to executive leadership
Approach:
- Built team capabilities in AI-driven operations through hands-on training and collaborative framework development
- Developed frameworks for AI-assisted decision making in incident response and capacity planning
- Built operational dashboards combining traditional monitoring (Splunk, ServiceNow) with AI-generated insights
- Implemented LLM-based tools and predictive analytics for operational intelligence
- Documented use cases and ROI metrics for executive reporting
Result:
- Scaled operations without headcount increase (absorbed 20% capacity growth with same team size)
- Positioned team for next-generation AI infrastructure demands (building expertise ahead of industry curve)
- Measurable efficiency gains: 30% reduction in incident triage time, 25% faster capacity planning cycles
- Collaborative framework for AI-driven operations adopted across the team
Key Insight: AI transformation requires both technical implementation and cultural change—building team capabilities while proving business value. Early wins with small use cases (incident summarization, log analysis) built credibility for larger transformations.
Case Study 3: Global Capacity Planning Framework
Challenge: Distributed infrastructure across 3 continents lacked strategic capacity planning, leading to reactive decision-making and risk of multi-million dollar overbuilds or capacity shortfalls.
Technical Context:
- Global infrastructure spanning US, EMEA, and APAC regions
- Mixed colocation and owned facilities with different capacity constraints
- Business growth forecasts varying by region and product line
- Capital planning cycles requiring 18-month lead times for major builds
- Legacy spreadsheet-based tracking insufficient for strategic planning
Approach:
- Developed 60-day strategic capacity planning initiative with phased deliverables
- Coordinated 4-6 remote teams across multiple regions through structured weekly syncs
- Built data-driven forecasting models aligned with business growth using historical utilization trends
- Established executive stakeholder review process with monthly steering committee
- Created standardized capacity metrics and reporting dashboards (power, cooling, space, network)
- Integrated vendor lead times and procurement cycles into planning models
Result:
- Data-driven capacity recommendations preventing multi-million dollar overbuilds (deferred 2 planned expansions)
- Strategic alignment across global infrastructure teams (unified planning process)
- Framework adopted as the organizational standard for future capacity decisions
- Executive visibility into infrastructure investment ROI (quarterly board-level reporting)
- Identified $3M in cost avoidance opportunities through workload consolidation and decommissioning
Key Insight: Strategic capacity planning delivers business value by preventing both overspending and capacity crises—but only with executive alignment and cross-functional coordination. The monthly steering committee was critical for resolving regional conflicts and maintaining strategic focus.
Case Study 4: Decommissioning Program at Scale
Challenge: Organization needed to decommission 22 legacy facilities while maintaining operational continuity and achieving cost savings.
Technical Context:
- 22 legacy facilities across multiple regions (US, EMEA, APAC)
- Mix of owned and leased facilities with varying lease terms
- Hardware asset recovery opportunity (servers, networking, storage)
- Compliance requirements for data sanitization and secure disposal
- Coordination with real estate, legal, finance, and operations teams
Approach:
- Designed comprehensive six-phase decommissioning framework (Assessment → Planning → Migration → Decommission → Disposal → Closeout)
- Coordinated asset disposition and liquidation strategy with remarketing vendors
- Managed stakeholder communication across multiple sites using project management tooling (Jira, Confluence)
- Ensured compliance with security and data requirements (NIST 800-88 data sanitization standards)
- Established decommissioning playbook adopted for future programs
Result:
- $200K recurring cost savings achieved (lease terminations, power/cooling reduction, reduced maintenance contracts)
- Zero operational incidents during decommissioning (maintained SLA commitments throughout)
- Reusable process framework for future decommissioning projects (reduced planning time by 40% for subsequent decommissions)
- Asset recovery maximized through strategic disposition ($150K in hardware resale value)
- Compliance requirements met with zero audit findings
Key Insight: Effective decommissioning is not just about shutting down equipment—it’s about strategic asset management, risk mitigation, and capturing financial value. The structured six-phase framework and clear documentation were critical to scaling the program across 22 facilities.
My Career Journey
Current Role: Principal, Data Center Operations - Salesforce (2020-Present)
Leading operations for an 8.5MW, 40,000 sq ft data center facility with a focus on:
- Strategic capacity planning and infrastructure modernization - Developed multi-year capacity roadmaps aligned with business growth
- AI/ML infrastructure transformation - Pioneering GPU cluster operations and liquid cooling implementations
- Cross-functional program leadership - Coordinating 40+ stakeholders across Network, Compute, Storage, Applications, Security, and Facilities teams
- Operational excellence - Maintained 99.99% uptime over 4 years, achieved 100% uptime in 2025
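For readers less familiar with availability targets, the sketch below converts an uptime percentage into an allowed-downtime budget. The function name is hypothetical and the calculation assumes a 365-day year; it is standard arithmetic rather than a description of how uptime was measured here.

```python
def downtime_budget_minutes(availability_pct: float, days: float = 365.0) -> float:
    """Minutes of downtime permitted over `days` at a given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100")
    total_minutes = days * 24 * 60
    return (100 - availability_pct) / 100 * total_minutes


for target in (99.9, 99.99, 100.0):
    minutes = downtime_budget_minutes(target)
    print(f"{target}% uptime allows {minutes:.2f} minutes of downtime per year")
```

At 99.99% the annual budget is roughly 52.6 minutes, so a single prolonged incident can consume the whole year's allowance.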
Key Achievements:
- Led $2M+ infrastructure exit programs with zero downtime and 100% on-schedule delivery
- Built team culture around AI-driven operations with measurable efficiency gains
- Managed transition from traditional 5-8kW/rack to AI-optimized 40-60kW/rack power density
- Led network refresh program upgrading 100G to 400G datacenter fabric
Technical Leadership:
- Developed 5 direct reports managing daily operations
- Coordinated 4-6 remote engineering teams for international builds
- Trained 50-60 engineers on hardware repair and operational processes (10-20 hours per engineer)
- Integrated 20,000+ pieces of equipment from acquisitions into asset management systems
Previous Experience: Network Engineer - Verizon Business (2006-2014)
Managed enterprise network infrastructure and international data center operations:
- Extensive international travel - >50% travel for 7 years supporting deployments across US, EMEA, and APAC regions
- Global deployments - Coordinated international infrastructure rollouts across multiple continents
- Multi-vendor coordination (enterprise to hyperscale environments) - Cisco, Juniper, Arista, Dell, HPE ecosystems
- Customer-facing technical leadership - Supported Fortune 500 clients with mission-critical infrastructure
- Network architecture and capacity planning - Designed MPLS, BGP, and data center networking solutions
- 30-site upgrade program - Led equipment installation, upgrades, and comprehensive documentation
- 576-fiber optical solution - Designed and deployed multi-node transport network
- Asset coordination - Managed multiple vendors for hardware installations at remote facilities
Military Service
United States Air Force (1996-2000)
Wideband Technician | Andrews Air Force Base, Maryland
Supported mission-critical communications infrastructure:
- Presidential communications: Maintained radio station supporting White House operations
- Joint Chiefs of Staff: Supported strategic military communications systems
- Defense Information Systems Agency: Managed secure government circuits
- Technical expertise: Radio systems, telecommunications infrastructure, secure communications
- Operational discipline: Foundation for data center operations excellence and safety-first approach
Technical training and operational experience in military communications infrastructure established the foundation for my civilian data center career.
Education & Credentials
- BS Technical Communications - Northeastern University
- DCIS (Data Center Infrastructure Specialist) - International Data Center Authority
- DCOS (Data Center Operations Specialist) - International Data Center Authority
- DCOM (Data Center Operations Manager) - International Data Center Authority
- U.S. Air Force Veteran - Technical training and operational discipline
What Makes Me Different
1. The Rare Operator-Strategist
Most data center leaders are either operators (fixing problems, keeping lights on) OR strategists (planning, PowerPoint presentations). Almost none do both well.
I maintain 99.99% uptime on 8.5MW operations while developing 3-year capacity plans that prevent million-dollar overbuilds. I can troubleshoot a cooling failure at 2am and present ROI analysis to the board at 2pm.
Why This Matters: You get someone who can both execute flawlessly and think strategically—not a consultant who talks about data centers from conference rooms.
2. Global Build Experience (Not Just US-Centric)
Most US-based infrastructure leaders have never built a data center outside North America. I’ve led projects across three continents:
- US Region: Multiple facility deployments including 8.5MW Dallas operations
- EMEA Region: Data center builds with local team training
- APAC Region: Multiple facility deployments with distributed engineering coordination
I understand regulatory requirements, vendor ecosystems, cultural coordination, and time zone management that make international builds succeed or fail.
Why This Matters: If you’re planning global expansion, you don’t want to learn hard lessons on your first EMEA deployment. I’ve already learned them.
3. The Full Lifecycle Advantage
Most leaders specialize: They’re either build experts (design/construction) OR operations experts (run the facility). Very few do both.
I’ve designed facilities, managed general contractors, commissioned systems, AND operated them at 99.99% uptime for years. This means:
- My build designs actually work for operations teams (not just look good on paper)
- My operations expertise informs capacity planning and expansion designs
- I prevent the expensive “build it wrong, fix it later” cycle
Why This Matters: Building a data center that’s hard to operate costs millions in retrofits and lost efficiency. I prevent that.
4. Business Impact Focus: Decisions That Save Millions
Example 1: Built Dallas data center preventing $2M/month cost overruns through strategic consolidation
Example 2: Deferred $3M in unnecessary capacity expansions through data-driven planning
Example 3: Achieved $200K recurring savings through strategic decommissioning of 22 facilities
I don’t just build things and keep them running. I make decisions that impact the bottom line.
Why This Matters: Every data center decision is a multi-million dollar bet. You want someone who thinks about ROI, not just uptime.
Let’s Talk Infrastructure
I work with two types of people:
Infrastructure Leaders navigating AI/ML data center planning, facility builds, or operational transformation—you need someone who’s been there and can cut through the noise.
Organizations facing capacity decisions, M&A integrations, or build programs—you need expertise that prevents million-dollar mistakes.
If you’re dealing with complex infrastructure challenges and want straight talk from someone who’s actually built and operated these systems at scale, let’s connect.
Email: chet@chetlwilliams.com
LinkedIn: linkedin.com/in/chetwilliams1
Mastodon: @chetwilliams@fosstodon.org
Telegram: @chetwilliams
Signal: Signal Message
Response Time: Within 24 hours
Complimentary 30-minute discovery calls available. No sales pitch—just honest assessment of whether I can help.
Last Updated: 2026-03-30