Monday, November 11, 2019

Homelab: Update


I promised an update when the rework was complete, and here it is.

First, the rack and stack. This took a bit of time, and a bit of patience as I sourced the best deals on my favorite auction site. My patience paid off, and here's the finished product. From top down:

Ubiquiti UniFi USW-XG (16 port 10Gb ethernet)
AC Infinity sensor based intelligent fan
Ubiquiti UniFi USW-16 (16 port 1Gb ethernet + 2 SFP)
Leviton commercial power conditioner
Dell EMC PowerEdge R620 x4
2014 Mac Mini (home media server)
Drobo 5c (connected to Mac) with 5x 1.5TB SSD
Drobo 5n (clients, vCenter backups) with 5x 8TB HDD

All of this is connected to my Ubiquiti UniFi based system. The only component of the network that is not UniFi is a SonicWall TZ350 just before my Comcast Business CPE.

This home lab is surprisingly quiet and consumes (again, surprisingly) far less power than I planned for, or even had available, when I built it.

It's all virtualized using vSphere (of course), managed by vCenter and vRealize Log Insight (VMUG Advantage is awesome by the way), and I recently installed a TIG stack (Telegraf, InfluxDB and Grafana) to see what kind of metrics I could get out of it. Questions / comments welcome.
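
For anyone curious what the Telegraf side of that looks like, here's a minimal sketch of the vSphere input plugin configuration. The vCenter address, service account, and database name are placeholders for my lab, and it assumes Telegraf and InfluxDB are already installed on a collector VM - adjust to taste.

# Minimal sketch - vcenter.local, the telegraf service account, and the
# database name are placeholders; assumes Telegraf and InfluxDB are installed.
cat <<'EOF' > /etc/telegraf/telegraf.d/vsphere.conf
[[inputs.vsphere]]
  vcenters = ["https://vcenter.local/sdk"]
  username = "telegraf@vsphere.local"
  password = "CHANGEME"
  insecure_skip_verify = true   # lab only - self-signed certs

[[outputs.influxdb]]
  urls     = ["http://localhost:8086"]
  database = "homelab"
EOF
systemctl restart telegraf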





/finis







Saturday, November 02, 2019

Homelab: connectivity

I've posted a little bit about my home lab, and have recently consolidated everything into a single rack. I'm still waiting on a few components for the rack to address recirculation and aesthetics, so I'll wait to publish pictures until that's complete.

I'm a big fan of Ubiquiti UniFi products, and they suit a lab's needs well. If you're looking for a managed set of simple layer 2 switches, go check them out. I'll publish links to the products as I describe them.

My lab is connected to my home network via the default VLAN. That's the only way into the lab. Everything else is isolated within the lab networks. My home network consists of a pretty robust firewall, and everything behind it is Ubiquiti.

I have a 1Gb fiber connection between the switch that serves my home and the lab "core" switch. The lab core is a UniFi Switch 16 XG (link) that offers (12) 1/10 Gb/sec SFP+ capable ports, and (4) 1/10 Gb/sec RJ45 ports. It is connected to a UniFi Switch 16 (link) and a UniFi Switch 8 (link).

The VLAN configuration simplifies everything in the network. Rather than worry about port assignments, VLAN-to-port tagging, and so on, I decided to create my distributed virtual switches with the VLAN tag set in the Distributed Port Groups. This way, I can maintain flexibility and simplicity. The only exception in this scheme is the NAS platform, which is connected to the default VLAN and is accessible from both the home network and the lab network.
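
If you'd rather script the port group creation than click through the UI, here's a rough sketch using govc (the CLI from the govmomi project). The switch name, port group names, VLAN IDs, and credentials below are made-up examples, not my actual config:

# Point govc at vCenter (placeholder credentials; GOVC_INSECURE for lab self-signed certs)
export GOVC_URL='https://vcenter.local'
export GOVC_USERNAME='administrator@vsphere.local'
export GOVC_PASSWORD='CHANGEME'
export GOVC_INSECURE=1

# One distributed port group per lab VLAN, with the tag carried in the port group
govc dvs.portgroup.add -dvs Lab-DVS -vlan 100 Lab-Mgmt
govc dvs.portgroup.add -dvs Lab-DVS -vlan 110 Lab-vMotion
govc dvs.portgroup.add -dvs Lab-DVS -vlan 120 Lab-vSAN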

The server connectivity is shown to the right.

I didn't have a 24 port switch, so I decided to separate management and provisioning. There's not really a need to do that for a small environment, but I could - so I did.

The vMotion and vSAN ports are separate, and the DVSes use separate VLANs on those connections. I could have used LACP on these ports, but in the interest of simplicity they are separate 10Gb/sec connections using SFP+ DACs.
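
For reference, tagging the VMkernel interfaces for those roles is only a couple of esxcli commands per host. A sketch below - vmk1 and vmk2 are assumptions, so substitute whichever VMkernel NICs sit on your vMotion and vSAN port groups:

# List the VMkernel interfaces and confirm which is which
esxcli network ip interface list

# Enable vMotion on the vMotion-facing interface (vmk1 here)
esxcli network ip interface tag add -i vmk1 -t VMotion

# Enable vSAN traffic on the vSAN-facing interface (vmk2 here)
esxcli vsan network ip add -i vmk2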

Hopefully if you're building a server based home lab, you find this helpful. Comments / questions welcomed below.

/finis

Homelab: The quest for the circle of trust

NOTICE: This contains some advanced and potentially dangerous configuration steps. If you're at all uncertain on this, please don't do it. I cannot assume any responsibility for your system or information security. This worked for me, and may introduce serious risk to your own system. Know what you're doing, and how to undo it - or don't read this.

I would like to address an issue that has come up with Mac OS Catalina (10.15.x). Besides the rapid release of fixes associated with iOS 13 and Catalina, one other issue has arisen that I found a workaround for. It truly is a workaround, and it appears to affect ONLY Chrome on Catalina.

NET::ERR_CERT_REVOKED

SSL certificates are a pain by any measure, and self-signed certificates no longer work in Chrome on Catalina (Catalina tightened the requirements for trusted TLS certificates, and Chrome uses the system trust store). So you can either create and sign certificates from your own trusted CA (a massive pain), or follow the steps below.

The NET::ERR_CERT_REVOKED message can't be bypassed like some SSL errors that Chrome reports. When you're on the internet or looking into a system that you're not completely familiar with, this is a good thing. However, when you KNOW the system (home labs are a perfect example), it's a royal pain.

So, upon connecting to my lab post-upgrade (to Mac OS Catalina), I received this message on all of my "home" systems. Connecting via Safari worked, as did connecting via Firefox - so I knew it was (1) a certificate issue, and (2) Chrome. Here's the workaround:


1. Open the URL in Safari (ex: 192.168.1.200). You will receive the usual SSL warning. Select "Show Details".
2. Here's a little-known Mac OS trick: once you view the details of the offending certificate in Safari, you can drag the certificate to your desktop by clicking, holding, and dragging the certificate image. You'll then have the certificate on your desktop.

3. Open "Keychain Access" and drag the certificate from your desktop into your login keychain. Once it's there, expand the "Trust" section at the top of the certificate's details and select "Always Trust". This will then allow you to connect via Chrome. (A command-line equivalent follows these steps.) PLEASE NOTE: If you are at all unsure about what you're doing here, please do not do it. This bypasses a VERY significant security feature of Mac OS and Chrome. I am only doing this because I trust these systems.
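
If you'd rather do steps 1 through 3 from a terminal, here's roughly the same thing with openssl and the macOS security tool. The IP is the example host from step 1, and this only marks the certificate trusted in your own login keychain - every warning above still applies:

# Pull the server's certificate (192.168.1.200 is the example host above)
openssl s_client -connect 192.168.1.200:443 </dev/null 2>/dev/null \
  | openssl x509 -outform PEM > lab-host.pem

# Trust it in YOUR login keychain only (use -r trustAsRoot if the cert isn't self-signed)
security add-trusted-cert -r trustRoot \
  -k ~/Library/Keychains/login.keychain-db lab-host.pem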

I hope this works for you. I would also STRONGLY state that this process should NEVER be used on any SSL protected connection that you are not 100% responsible for, and definitely not for something outside of your own network and control.


/finis

Thursday, October 17, 2019

PowerEdge and vSphere. My home lab upgrade

So I just finished installing my new Dell EMC PowerEdge servers in my home lab. The difference one generation of server makes is astounding. The new machines are pretty stout and will serve well in the experiments and learning I want / need to do.

Home labs get budget racks
4x Dell EMC PowerEdge R620 (Sandy Bridge EP)
- 2x Intel Xeon E5-2670 2.6GHz eight-core CPU
- 128GB RAM each
- 1x Dell 400GB SAS SSD (vSAN Cache tier)
- 2x Dell 1.2TB SAS 10k (vSAN capacity tier)
- 2x Dell 600GB SAS 10k (local datastore)
- 2x Samsung 16GB SD-Card (boot)

Ubiquiti UniFi 16 port 1Gb switch (+ 2 1Gb SFP)
Ubiquiti UniFi 16 port 10Gb switch
Ubiquiti UniFi 8 port 1Gb switch
Spanning Tree enabled
Uplinked to my "home network" but isolated from it except for management (all workloads isolated but internet accessible)

The nodes are connected as follows:
- iDRAC is on a dedicated VLAN (16 port UniFi)
- eth4 is on VLAN 1 (16 port UniFi)
- eth3 is on a dedicated routed VLAN (8 port UniFi)
- eth5 is on a 10Gb SFP+ DAC for vMotion (closed VLAN)
- eth6 is on a 10Gb SFP+ DAC for vSAN (closed VLAN)

Configuring these machines was SO simple.

  1. Since I bought them used, I connected to the iDRAC first and downloaded the Enterprise license key. I then reset the iDRAC. This took a few minutes - but trust me - it's worth it to not have to slog through troubleshooting only to find out some obscure setting was in your way.
  2. Once that was finished, I connected a local keyboard and monitor to each server and set the static IP address, admin user, and a few other options. This can be done remotely, but it's kind of a pain to discover the iDRAC and have to reconnect. The 5 minutes it took was worth the "in person" visit to my basement.
  3. I then used Virtual Media to mount the Dell EMC Remote Update ISO. If you're not already aware of this gift - get aware. It's an ISO image (so could be burned to DVD and run locally if you wanted to) that I mounted to the virtual CD and booted the server from. Think of this as a run-time out of band Lifecycle Management tool for all of the devices in your compute node. It updates everything it finds to the versions on the ISO and restarts the system.

    You can find the ISO for your system here.
  4. I then proceeded to mount the vSphere image (the Dell EMC custom-build ISO) and installed vSphere to the SD-Card. 
Once all of that was finished, I configured my DVSes and VMkernel NICs and was ready to start playing. 

But wait... there's more...

Backstory: Every Dell EMC PowerEdge contains a Lifecycle Management utility in its pre-boot environment. This LCM process allows you to connect to Dell from any internet-accessible network and - just like the ISO in step #3 - it will analyze everything in your system and offer to update it. Since the ISO I downloaded in step #3 was from July, there had most certainly been updates issued by Dell EMC since then.

Anyone want to buy some R610's and a NetApp 10Gb switch?
So, I configured everything - including vSAN - and it's all running beautifully. I then put Server #1 into maintenance mode (vSAN is configured for FTT=1) and proceeded to reboot into the LCM. Sure enough, it found several firmware items newer than what was installed, so I let it do its thing.
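
For the record, the maintenance mode step can be done from the host's shell as well. A sketch, assuming the "ensure accessibility" data handling that fits an FTT=1 cluster:

# Enter maintenance mode without evacuating all vSAN data
# (ensureObjectAccessibility keeps objects available with reduced redundancy)
esxcli system maintenanceMode set -e true -m ensureObjectAccessibility

# ...reboot into the LCM, apply firmware, boot back into ESXi, then:
esxcli system maintenanceMode set -e false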

vSAN is magic. Period. It has come SO far in so short a period of time - I'm a HUGE fanboy. The LCM process on Server #1 took about 45 minutes - tons of time before the vSAN rebuild starts (the default object repair delay is 60 minutes). Except I'm an idiot. I got distracted, forgot the server was updating, and what do you know... 90 minutes or so after the LCM started, I realized it was finished and rebooted ESX.

Before ESX completely booted, vSAN stopped rebuilding and re-synchronized the node with the other 3 members. 

I'm really enjoying my time with VMware. I'm hoping (now that I have enough CPU and RAM) to start messing around with PKS and, later, OpenShift. I'll continue to update...


Proudly displayed on the wall in my "lab" because why not?


/finis

Friday, September 27, 2019

At long last...

Across two companies, what feels like an eternity (actually only 2 1/2 years), and many, many iterations, my "pet project" is finally announced. I can tell you that our partners are excited, our teams are energized, and we can't get out fast enough to talk to all of the customers that want this.

Hybrid Cloud is nothing new, certainly not in the land of marketing and buzzword bingo, but Hybrid Cloud in an appliance form factor that focuses solely on "high value workloads" has certainly been elusive.

The preview announcement, made at SAP TechEd by Sven Denecken (SVP, SAP S/4 HANA) described an industry partnership to bring a fully managed hybrid experience to customers that run SAP workloads.

With the 2025 deadline approaching, customers' migration journeys to S/4 HANA are under way, and with those journeys come infrastructure questions and choices. The position that Dell Technologies and our partners are taking is that customers' consumption of these technologies shouldn't be a "cloud OR..." question; rather, it's a "cloud AND..." question. This consumption model provides great flexibility as to where workloads run, and enables workload mobility and management based on an SLA - not based on a location.


The Kinetic Hybrid Cloud for SAP is brought to you by Deloitte, Dell Technologies and Intel. This simple diagram shows the Unified Operations and workflow integration that Deloitte and Dell Technologies bring to SAP workloads across multiple datacenter instances.

Dell Boomi provides integration of data sources and applications outside of the SAP Application ecosystem; SAP has intelligent integration points within the SAP Application ecosystem; Deloitte brings Hybrid Cloud Management to the horizontal suite of deployment solutions; and Dell EMC brings complete infrastructure management for the on-premises infrastructure components.

All of this is managed under one SLA, one price, and a single engagement model through Dell Technologies and Deloitte.

I'm really excited about this combination of companies, technologies and people. This is only the beginning - and I'll share much more when I'm able to.

/finis

Saturday, September 07, 2019

Your own personal datacenter


About a year ago, I decided to build my own personal datacenter. I won't go into gory specifics, except to say that I tried to buy a VxRail cluster, and discovered that I am not a billionaire or a business with a capital budget.

SO off to my favorite auction site I went, and found a great assortment of gently used Dell EMC PowerEdge servers. I chose 4 of them, bought a couple of Ubiquiti switches to add to my home network, and off I went.

Here's how things are currently cabled:


And here is what my home cloud is running:


It's all VMware based, but I'll be adding Red Hat OpenShift and Kabanero.io once the RAM I ordered arrives. Fun times ahead.

/finis



Back in the saddle

Well, I took a brief hiatus, and 4 years later I'm back. I'm going to attempt to keep this up to date with tips, tricks and general palaver...

Introduction:
I've never had a reason to run VMware vSphere in a professional capacity. Several years ago, I became very interested in it and began running a small embedded vSphere lab in our company's lab. Aside from simply messing around, I slowly learned how, what, and why things work the way they do. Fast forward to today, and I'm running a full-on hyperconverged data center in my basement. It makes my wife very happy...

I have come to rely on a few extremely smart friends for advice, help, and so on - and they have been amazing with the amount of knowledge they're willing to share. One of them is zsoldier, a friend and colleague I'm blessed to know. Find him here: https://tech.zsoldier.com/ - and I'm very thankful for his wisdom and friendship.


Problem: vCenter Server failed with a disk capacity issue for its database.

I had been away on business, and when I came back I checked in on my vSphere cluster: vCenter was up, but not really. When I started digging in via the appliance management interface (https://vcenter.local:5480), I discovered that the database partition (seat - Stats, Events, Alarms and Tasks) was out of space, and the vCenter Server service couldn't start.

I started researching this, and found that this issue is pretty well documented (here: https://kb.vmware.com/s/article/2145603 and here: https://kb.vmware.com/s/article/2126276).

It literally took longer to restart services and back up vCenter than it did to fix the issue. I had considerable trouble identifying the exact volume to expand, because the default vCenter with embedded PSC installation process creates a bunch of 10GB volumes. Nothing is labeled, so seat could be on any of these volumes.


Here are the steps I took to fix the issue (a condensed command-line version follows the list):

1. SSHed to the vCenter appliance and enabled the shell
2. Ran df -h to determine which mount point seat was on - completely useless except to see that it was, indeed, full - but helpful later
3. Opened the vSphere UI on the host I knew vCenter and the embedded PSC were running on. I then edited each of the 10GB vDisks to a unique size: 11GB, 12GB, 13GB, and so on, until they were all different.
4. Ran the shell script that autogrows the changed LUNs: "/usr/lib/applmgmt/support/scripts/autogrow.sh"
5. Ran df -h again to see which of the now-unique LUNs was /dev/mapper/seat_vg-seat (it was the 13GB one)
6. Went back to the vSphere host running the vCenter VM and increased the 13GB volume to 55GB (thin provisioned, so who cares, right?)
7. Ran the autogrow shell script again
8. Ran df -h again to confirm that seat did, in fact, grow to 55GB
9. Rebooted the appliance
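
Condensed into commands, the appliance-side portion looks roughly like this - the mount point is standard, but the sizes and device mapping are from my environment:

# From the vCenter appliance shell
df -h /storage/seat      # confirm the SEAT partition is full
lsblk                    # after making the vDisk sizes unique, map device -> size

# ...grow the matching virtual disk in the vSphere UI, then:
/usr/lib/applmgmt/support/scripts/autogrow.sh
df -h /storage/seat      # verify the filesystem picked up the new size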

Voila! Healthy again!


/finis