About

The Challenge

Data center leaders face three converging pressures: AI/ML infrastructure demands requiring 3-5x power density, global capacity planning across distributed facilities, and operational excellence under relentless uptime requirements.

Most organizations struggle to find leaders who can operate at both tactical and strategic levels—who understand the technical realities of data center operations while driving business transformation.

My Approach

I’m the person organizations send in when they don’t know what to do—when the facility design doesn’t exist yet, the process needs building, or nobody wants to own the problem. I solve what others won’t or can’t.

I’ve spent 20+ years solving these problems at scale. My early career at Verizon Business involved extensive international travel across the US, EMEA, and APAC. At Salesforce I lead hyperscale operations (8.5MW, 40K sq ft, 99.99% uptime), and along the way I have managed multi-million dollar facility builds and coordinated global infrastructure deployments across three continents.

I work effectively at every organizational level: I brief executives, coordinate with facilities teams, partner with engineering, and hold vendors accountable. I meet people where they are and show them how to get where they want to go.

Results I Deliver

International build leadership across US, EMEA, and APAC regions with local team training
General contractor coordination for multi-million dollar facility programs
M&A infrastructure integration (Commerce Cloud, Pardot) with zero business disruption
Team development of 5 direct reports and training of 50-60 engineers
Asset management of 20,000+ equipment pieces across multiple facilities
Built Dallas data center preventing $2M/month cost overruns
Led decommissioning of 22 facilities achieving $200K recurring savings
Maintained 100% uptime across mission-critical infrastructure (2025)

How I Work

I’m currently leading infrastructure operations at Salesforce while taking on select strategic advisory and interim leadership engagements. This dual-track approach lets me stay current with cutting-edge infrastructure challenges while helping organizations solve their most complex problems.

Available For:

Full-time VP/SVP roles: Exceptional opportunities leading infrastructure organizations at scale
Strategic advisory: 2-4 week assessments, ongoing retainer relationships
Interim leadership: 3-6 month build programs, turnaround situations, transition leadership
Executive coaching: Career acceleration for data center leaders moving to VP/SVP levels

Not Available For:

Short-term tactical projects
Roles without strategic impact
Opportunities below VP/Director level

Data Center Build & Program Leadership

International Deployment Experience

Led data center infrastructure projects across three continents (US, EMEA, APAC) with focus on:

Greenfield data center design and deployment
Brownfield facility expansions and capacity additions
Physical infrastructure: Power distribution systems, space planning, cooling systems, structured cabling (fiber/copper)
General contractor oversight for multi-million dollar build programs
International team training and operational handoff

Technical Leadership Scope:

Power Systems: Distribution planning, UPS management, capacity forecasting
Space Planning: Multi-data hall layouts, rack optimization, equipment placement
Cooling Systems: CRAC/CRAH coordination, cooling capacity management
Structured Cabling: Fiber backbone and copper distribution infrastructure
Equipment Installation: Server, network, and storage infrastructure deployment

Program Management:

Led 4-6 remote engineering teams across international regions
Coordinated builds across US data centers and EMEA/APAC facilities
Managed general contractors and multi-vendor ecosystems
Delivered operational training for newly deployed infrastructure

Technical Expertise & Core Competencies

Infrastructure & Operations

Data Center Design & Construction: Greenfield site design, brownfield expansion planning, physical infrastructure coordination (power systems, space planning, cooling systems, fiber/copper structured cabling), general contractor oversight, international team coordination → Prevents multi-million dollar overbuilds and capacity shortfalls through data-driven facility planning
Power & Cooling: 8.5MW facility operations, N+1/2N redundancy design, liquid cooling for AI workloads, power density optimization (traditional 5-8kW/rack to AI 40-60kW/rack) → Enables AI/ML infrastructure transformation while controlling energy costs and cooling requirements
Network Architecture: ToR/EoR switching designs, 100G/400G datacenter fabrics, BGP routing, software-defined networking (SDN), network automation → Delivers high-performance connectivity supporting millions of users with 99.99% uptime
Compute & Storage: HPC cluster design, GPU infrastructure for AI/ML workloads, high-availability storage systems (SAN/NAS), hyperconverged infrastructure → Supports business-critical applications and AI workloads with zero data loss and rapid recovery
Physical Infrastructure: Raised floor/slab design, hot aisle/cold aisle containment, cable management, structured cabling (fiber/copper), DCIM implementation → Optimizes operational efficiency and reduces cooling costs through proper airflow management
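
To make the power density shift listed above concrete, here is a rough arithmetic sketch. It assumes, purely for illustration, that a fixed IT power budget is split evenly across racks. The 1,000 kW budget and the helper name `racks_supported` are hypothetical and are not figures from any facility described on this page.

```python
def racks_supported(it_power_kw: float, kw_per_rack: float) -> int:
    """Whole racks that a given IT power budget can feed at one density."""
    if kw_per_rack <= 0:
        raise ValueError("kw_per_rack must be positive")
    return int(it_power_kw // kw_per_rack)


# Hypothetical IT power budget for one data hall (an assumption).
BUDGET_KW = 1000.0

# Density endpoints quoted above: traditional 5-8 kW/rack, AI 40-60 kW/rack.
for label, kw in [("traditional low", 5.0), ("traditional high", 8.0),
                  ("AI low", 40.0), ("AI high", 60.0)]:
    print(f"{label:>16}: {kw:4.0f} kW/rack -> {racks_supported(BUDGET_KW, kw)} racks")
```

The same power budget feeds far fewer racks at AI densities, which is why power and cooling, rather than floor space, tend to become the binding constraints in this kind of planning.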

Strategic Planning & Program Management

Capacity Planning: 3-5 year infrastructure forecasting, demand modeling, financial optimization (TCO analysis), site selection and evaluation
Change Management: ITIL/ITSM frameworks, incident/problem management, release management, continuous improvement methodologies
Project Management: Multi-million dollar facility builds, decommissioning programs, vendor management, budget oversight ($2M+ monthly responsibility)
Risk Management: Business continuity planning, disaster recovery, compliance frameworks (SOC 2, ISO 27001 principles), audit coordination

AI/ML Infrastructure (Emerging Specialization)

AI Infrastructure Planning: GPU cluster architecture, high-bandwidth networking for distributed training, inference optimization, model deployment infrastructure
Power/Cooling for AI: Liquid cooling systems, rear-door heat exchangers, in-row cooling, power distribution for high-density racks
HPC Environments: Parallel computing architectures, InfiniBand/RoCE fabrics, job scheduling, performance optimization
Operational Intelligence: AI-assisted monitoring, predictive analytics, automated remediation, LLM-based operational tools

Tools & Platforms

Monitoring & Observability: Splunk, ServiceNow, PagerDuty, custom dashboards, alerting frameworks → Predictive operations reducing MTTR by 30% through AI-assisted incident detection
Automation & Scripting: Python, Bash, Ansible, infrastructure-as-code principles, CI/CD for infrastructure → Enables consistent, error-free deployments and reduces manual operational overhead
Documentation & Collaboration: Confluence, Jira, Slack, Obsidian (knowledge management), technical writing → Facilitates knowledge management across distributed teams and accelerates onboarding
Asset Management: DCIM platforms, Asset Force, inventory tracking systems, hardware lifecycle management → Prevents $500K+ in lost assets and enables strategic hardware refresh planning
Vendor Ecosystems: Cisco, Dell, HPE, VMware, Arista, NVIDIA (AI infrastructure), colocation providers (Equinix, QTS, Digital Realty) → Multi-vendor expertise prevents vendor lock-in and optimizes procurement costs

Industry Certifications & Standards

DCIS (Data Center Infrastructure Specialist) - International Data Center Authority
DCOS (Data Center Operations Specialist) - International Data Center Authority
DCOM (Data Center Operations Manager) - International Data Center Authority
ITIL Principles - IT Service Management frameworks
Data Center Tier Standards - Uptime Institute Tier I-IV designs
ASHRAE Thermal Guidelines - TC 9.9 standards for data center environments

Case Studies

Case Study 1: Multi-Million Dollar Data Center Exit Strategy

Challenge: Major tech company needed to vacate data center hosting 158 applications before lease expiration, requiring complex coordination across multiple teams while maintaining zero downtime.

Technical Context:

40,000 sq ft facility with mixed compute, storage, and networking infrastructure
Legacy systems requiring careful migration planning
Network re-architecture needed for new site connectivity
Hardware refresh opportunity aligned with exit timeline

Approach:

Led cross-functional team of 40+ stakeholders across 6 teams (Network, Storage, Compute, Applications, Security, Facilities)
Developed comprehensive migration strategy with parallel workstreams using RACI framework
Coordinated equipment relocation worth $1M/month in operational value (preventing replacement costs)
Established clear accountability and communication frameworks with weekly executive briefings
Implemented risk mitigation strategies including rollback procedures and pilot migrations

Result:

12-month program completed in full and on schedule
Zero downtime during entire migration (maintained SLA commitments)
$2M/month cost avoidance achieved through facility exit and consolidated operations
Process framework adopted for future facility exits across the organization
Team capability building: 6 team members gained program management experience

Key Insight: Large-scale infrastructure transformations succeed through stakeholder alignment and clear ownership—not just technical execution. The RACI framework and weekly executive visibility were critical to maintaining momentum and resolving blockers quickly.

Case Study 2: AI-Driven Operations Transformation

Challenge: Operations team needed to scale efficiency without headcount increases while handling growing infrastructure complexity and AI/ML workload demands.

Technical Context:

Operations team managing 8.5MW facility with 2,000+ servers, 500+ network devices, 100+ storage arrays
Manual processes for incident triage, capacity planning, and change management
Growing demand for AI/ML workloads requiring new operational approaches
Need to demonstrate ROI of AI investments to executive leadership

Approach:

Built team capabilities in AI-driven operations through hands-on training and collaborative framework development
Developed frameworks for AI-assisted decision making in incident response and capacity planning
Built operational dashboards combining traditional monitoring (Splunk, ServiceNow) with AI-generated insights
Implemented LLM-based tools and predictive analytics for operational intelligence
Documented use cases and ROI metrics for executive reporting

Result:

Scaled operations without headcount increase (absorbed 20% capacity growth with same team size)
Positioned team for next-generation AI infrastructure demands (building expertise ahead of industry curve)
Measurable efficiency gains: 30% reduction in incident triage time, 25% faster capacity planning cycles
Collaborative framework for AI-driven operations adopted across the team

Key Insight: AI transformation requires both technical implementation and cultural change—building team capabilities while proving business value. Early wins with small use cases (incident summarization, log analysis) built credibility for larger transformations.

Case Study 3: Global Capacity Planning Framework

Challenge: Distributed infrastructure across 3 continents lacked strategic capacity planning, leading to reactive decision-making and risk of multi-million dollar overbuilds or capacity shortfalls.

Technical Context:

Global infrastructure spanning US, EMEA, and APAC regions
Mixed colocation and owned facilities with different capacity constraints
Business growth forecasts varying by region and product line
Capital planning cycles requiring 18-month lead times for major builds
Legacy spreadsheet-based tracking insufficient for strategic planning

Approach:

Developed 60-day strategic capacity planning initiative with phased deliverables
Coordinated 4-6 remote teams across multiple regions through structured weekly syncs
Built data-driven forecasting models aligned with business growth using historical utilization trends
Established executive stakeholder review process with monthly steering committee
Created standardized capacity metrics and reporting dashboards (power, cooling, space, network)
Integrated vendor lead times and procurement cycles into planning models
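
The forecasting step above can be sketched in a few lines. This is a minimal illustration, assuming a simple linear trend fitted to evenly spaced monthly utilization samples; the models used in the actual program are not described here. The function names, the sample history, and the 850 kW planning limit are all hypothetical. Only the 18-month lead time comes from the technical context above.

```python
def fit_trend(values):
    """Least-squares slope and intercept for evenly spaced samples."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def months_until(values, limit):
    """Months from the latest sample until the trend reaches `limit`.

    Returns None when the trend is flat or declining.
    """
    slope, intercept = fit_trend(values)
    if slope <= 0:
        return None
    current = intercept + slope * (len(values) - 1)
    return max(0.0, (limit - current) / slope)


def must_commit_now(values, limit, lead_time_months):
    """True when the projected runway is shorter than the build lead time."""
    remaining = months_until(values, limit)
    return remaining is not None and remaining <= lead_time_months


# Hypothetical monthly power draw (kW) and planning limit.
history_kw = [600, 620, 640, 660, 680, 700]
print(months_until(history_kw, limit=850))
print(must_commit_now(history_kw, limit=850, lead_time_months=18))
```

Comparing the projected runway against procurement and build lead times is what turns a utilization chart into a decision: if the runway is shorter than the lead time, the expansion decision is already late.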

Result:

Data-driven capacity recommendations preventing multi-million dollar overbuilds (deferred 2 planned expansions)
Strategic alignment across global infrastructure teams (unified planning process)
Framework adopted as the organizational standard for future capacity decisions
Executive visibility into infrastructure investment ROI (quarterly board-level reporting)
Identified $3M in cost avoidance opportunities through workload consolidation and decommissioning

Key Insight: Strategic capacity planning delivers business value by preventing both overspending and capacity crises—but only with executive alignment and cross-functional coordination. The monthly steering committee was critical for resolving regional conflicts and maintaining strategic focus.

Case Study 4: Decommissioning Program at Scale

Challenge: Organization needed to decommission 22 legacy facilities while maintaining operational continuity and achieving cost savings.

Technical Context:

22 legacy facilities across multiple regions (US, EMEA, APAC)
Mix of owned and leased facilities with varying lease terms
Hardware asset recovery opportunity (servers, networking, storage)
Compliance requirements for data sanitization and secure disposal
Coordination with real estate, legal, finance, and operations teams

Approach:

Designed comprehensive six-phase decommissioning framework (Assessment → Planning → Migration → Decommission → Disposal → Closeout)
Coordinated asset disposition and liquidation strategy with remarketing vendors
Managed stakeholder communication across multiple sites using project management tooling (Jira, Confluence)
Ensured compliance with security and data requirements (NIST 800-88 data sanitization standards)
Established decommissioning playbook adopted for future programs

Result:

$200K recurring cost savings achieved (lease terminations, power/cooling reduction, reduced maintenance contracts)
Zero operational incidents during decommissioning (maintained SLA commitments throughout)
Reusable process framework for future decommissioning projects (reduced planning time by 40% for subsequent decommissions)
Asset recovery maximized through strategic disposition ($150K in hardware resale value)
Compliance requirements met with zero audit findings

Key Insight: Effective decommissioning is not just about shutting down equipment—it’s about strategic asset management, risk mitigation, and capturing financial value. The structured six-phase framework and clear documentation were critical to scaling the program across 22 facilities.

My Career Journey

Current Role: Principal, Data Center Operations - Salesforce (2020-Present)

Leading operations for 8.5MW, 40,000 sq ft data center facility with focus on:

Strategic capacity planning and infrastructure modernization - Developed multi-year capacity roadmaps aligned with business growth
AI/ML infrastructure transformation - Pioneering GPU cluster operations and liquid cooling implementations
Cross-functional program leadership - Coordinating 40+ stakeholders across Network, Compute, Storage, Applications, Security, and Facilities teams
Operational excellence - Maintained 99.99% uptime over 4 years, achieved 100% uptime in 2025
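
For context on what an availability figure like 99.99% allows, a short arithmetic sketch follows. It assumes a 365-day year and treats availability as simple wall-clock uptime; the function name `downtime_budget_minutes` is hypothetical, and real SLA definitions vary in what they count.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_pct: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Maximum downtime in minutes that still meets the availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period_minutes * (1 - availability_pct / 100)


for target in (99.9, 99.99, 100.0):
    print(f"{target}% -> {downtime_budget_minutes(target):.2f} minutes per year")
```

Under these assumptions, 99.99% leaves a budget of a little under an hour per year, so a single extended incident can consume it.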

Key Achievements:

Led $2M+ infrastructure exit programs with zero downtime and 100% on-schedule delivery
Built team culture around AI-driven operations with measurable efficiency gains
Managed transition from traditional 5-8kW/rack to AI-optimized 40-60kW/rack power density
Led network refresh program upgrading 100G to 400G datacenter fabric

Technical Leadership:

Developed 5 direct reports managing daily operations
Coordinated 4-6 remote engineering teams for international builds
Trained 50-60 engineers on hardware repair and operational processes (10-20 hours per engineer)
Integrated 20,000+ pieces of equipment from acquisitions into asset management systems

Previous Experience: Network Engineer - Verizon Business (2006-2014)

Managed enterprise network infrastructure and international data center operations:

Extensive international travel - >50% travel for 7 years supporting deployments across US, EMEA, and APAC regions
Global deployments - Coordinated international infrastructure rollouts across multiple continents
Multi-vendor coordination (enterprise to hyperscale environments) - Cisco, Juniper, Arista, Dell, HPE ecosystems
Customer-facing technical leadership - Supported Fortune 500 clients with mission-critical infrastructure
Network architecture and capacity planning - Designed MPLS, BGP, and data center networking solutions
30-site upgrade program - Led equipment installation, upgrades, and comprehensive documentation
576-fiber optical solution - Designed and deployed multi-node transport network
Asset coordination - Managed multiple vendors for hardware installations at remote facilities

Military Service

United States Air Force (1996-2000)

Wideband Technician | Andrews Air Force Base, Maryland

Supported mission-critical communications infrastructure:

Presidential communications: Maintained radio station supporting White House operations
Joint Chiefs of Staff: Supported strategic military communications systems
Defense Information Systems Agency: Managed secure government circuits
Technical expertise: Radio systems, telecommunications infrastructure, secure communications
Operational discipline: Foundation for data center operations excellence and safety-first approach

Technical training and operational experience in military communications infrastructure established the foundation for my civilian data center career.

Education & Credentials

BS Technical Communications - Northeastern University
DCIS (Data Center Infrastructure Specialist) - International Data Center Authority
DCOS (Data Center Operations Specialist) - International Data Center Authority
DCOM (Data Center Operations Manager) - International Data Center Authority
U.S. Air Force Veteran - Technical training and operational discipline

What Makes Me Different

1. The Rare Operator-Strategist

Most data center leaders are either operators (fixing problems, keeping lights on) OR strategists (planning, PowerPoint presentations). Almost none do both well.

I maintain 99.99% uptime on 8.5MW operations while developing 3-year capacity plans that prevent million-dollar overbuilds. I can troubleshoot a cooling failure at 2am and present ROI analysis to the board at 2pm.

Why This Matters: You get someone who can both execute flawlessly and think strategically—not a consultant who talks about data centers from conference rooms.

2. Global Build Experience (Not Just US-Centric)

Most US-based infrastructure leaders have never built a data center outside North America. I’ve led projects across three continents:

US Region: Multiple facility deployments including 8.5MW Dallas operations
EMEA Region: Data center builds with local team training
APAC Region: Multiple facility deployments with distributed engineering coordination

I understand regulatory requirements, vendor ecosystems, cultural coordination, and time zone management that make international builds succeed or fail.

Why This Matters: If you’re planning global expansion, you don’t want to learn hard lessons on your first EMEA deployment. I’ve already learned them.

3. The Full Lifecycle Advantage

Most leaders specialize: They’re either build experts (design/construction) OR operations experts (run the facility). Very few do both.

I’ve designed facilities, managed general contractors, commissioned systems, AND operated them at 99.99% uptime for years. This means:

My build designs actually work for operations teams (not just look good on paper)
My operations expertise informs capacity planning and expansion designs
I prevent the expensive “build it wrong, fix it later” cycle

Why This Matters: Building a data center that’s hard to operate costs millions in retrofits and lost efficiency. I prevent that.

4. Business Impact Focus: Decisions That Save Millions

Example 1: Built Dallas data center preventing $2M/month cost overruns through strategic consolidation

Example 2: Deferred two planned capacity expansions and identified $3M in cost avoidance through data-driven capacity planning

Example 3: Achieved $200K recurring savings through strategic decommissioning of 22 facilities

I don’t just build things and keep them running. I make decisions that impact the bottom line.

Why This Matters: Every data center decision is a multi-million dollar bet. You want someone who thinks about ROI, not just uptime.

Let’s Talk Infrastructure

I work with two types of people:

Infrastructure Leaders navigating AI/ML data center planning, facility builds, or operational transformation—you need someone who’s been there and can cut through the noise.

Organizations facing capacity decisions, M&A integrations, or build programs—you need expertise that prevents million-dollar mistakes.

If you’re dealing with complex infrastructure challenges and want straight talk from someone who’s actually built and operated these systems at scale, let’s connect.

Email: chet@chetlwilliams.com
LinkedIn: linkedin.com/in/chetwilliams1
Mastodon: @chetwilliams@fosstodon.org
Telegram: @chetwilliams
Signal: Signal Message
Response Time: Within 24 hours

Complimentary 30-minute discovery calls available. No sales pitch—just honest assessment of whether I can help.

Last Updated: 2026-03-30