HyperCloud Customer Cluster Deployment
Danger
This procedure should not be followed without the assistance of a SoftIron Solutions Architect!
Info
This deployment guide is intended for the deployment of a new cluster from SoftIron. If reusing nodes from an existing HyperCloud or HyperDrive environment, you must wipe the data disks from a running OS prior to beginning (i.e. `wipefs -a -f /dev/XXXX`, where `XXXX` = the disk to wipe). For a HyperDrive cluster conversion, SoftIron can provide, by request, replacement HyperCloud static node boot drives imaged with the latest HyperCloud release. No other nodes in a HyperCloud cluster need a boot drive; those drives do not need to be replaced and should be removed from all other nodes prior to beginning.
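If several data disks must be wiped, a short loop can help. The sketch below is illustrative only: the device names are placeholders, and the list must be adjusted so the boot drive is never included.

```
# List disks first and confirm which ones are data disks (NOT the boot drive)
lsblk -d -o NAME,SIZE,MODEL

# Wipe each confirmed data disk; the device names here are placeholders
for disk in sdb sdc sdd; do
    wipefs -a -f "/dev/${disk}"
done
```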
Notice
Any hardware component faults or failures (HDDs, SSDs, NICs, RAM, etc.) should be resolved prior to starting a HyperCloud conversion from a HyperDrive system. Upgrading a cluster with failed hardware from HyperDrive to HyperCloud is not a supported operation.
Equipment Recommended
- Laptop
- USB A to C Converter(s) or hub
- USB C Male to 4 USB A Female - Basesailor - This four port USB A to C hub works great with macOS
- USB Network Adapter
- USB 3.0 to 10/100/1000 Gb Ethernet Adapter - Amazon Basics - Works well, but requires a recent release of macOS
- USB C to 2.5Gb Ethernet Adapter - Belkin - This adapter is well supported by all releases of macOS
- A Cat 5 or 6 RJ45 cable, ~2 to 3 meters
- Either a USB extension cable or a console cable longer than the supplied blue console cables (USB-A to USB-B)
- This allows equipment to sit on the floor while remaining connected to the top of the rack
- The console cables that come in the boxes with the switches can be finicky
- Cisco Rollover cable
- May be useful to have an M.2 to USB adapter
- In-rack PDU power adapter (country dependent) in case the laptop needs to charge
Equipment Needed
- Cables Required:
- Interconnect connections:
- 2 - 0.5 meter 100 Gb DAC cables (interconnect to interconnect peer links)
- 3 - 0.5 meter 1 Gb copper cables (interconnect management links)
- 2 - 0.5 meter 25 Gb DAC cables (management to high speed interconnect links)
- 6 - 3 meter C13 to C14 in-rack PDU power cables (need long cables to route from front to back of rack)
- Per node connections (multiply by number of nodes)
- 2 - 3 meter 25 Gb DAC cables
- 1 - 3 meter 1 Gb copper cable
- 2 - 2 meter C13 to C14 in-rack PDU power cables
In addition, be sure to have two fiber optic cables on hand for the uplink, selected to suit the connection to the customer fabric. Customers should provide patch cables and the optical modules for their side.
Pre-Startup Check
- All drives are present
- All NICs are present and in working order
- All firmware is up-to-date:
  - NIC
  - UEFI
  - BMC
  - SSD
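These checks can be made from any running Linux environment on the node (for example, a rescue or maintenance image). The commands below are a generic sketch rather than a HyperCloud-specific tool:

```
# Confirm all expected drives are present
lsblk -d -o NAME,SIZE,MODEL,SERIAL

# Confirm all NICs are present and have link
lspci | grep -i ethernet
ip -br link

# Check the BMC from the host and note its firmware version
ipmitool mc info
```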
Software Needed
- Latest HyperCloud OS Image:
- Latest approved Edge-Core SONiC Image:
- Latest SoftIron Interconnect Configuration Scripts:
Notice
These files are SoftIron internal-only. If the deployment is performed by a customer or partner, please open a ticket with SoftIron to obtain these files.
Step 0 - Discovery
This information is required prior to a site visit.
Warning
Recommended VLAN IDs unless there is a very good reason to use something different:
- 3968 for Storage
- 3969 for Compute
Layer 2 Information:
- The VLAN ID for OOB
  - All node BMCs, and the Interconnects' management IPs
- The VLAN ID for HyperCloud Management access (i.e. the VLAN the Dashboard GUI will be presented on)
  - Can be the same VLAN ID as the OOB network
  - This will be the VLAN ID from the customer for the HyperCloud Dashboard
- Any other VLAN IDs the client may want to use for VMs
Layer 3 Information:
- IP for the Dashboard
- IPs for OOB
- Three for the interconnects (One IP for each)
- N for BMCs (usually only one, or one per node)
- For each VLAN ID + IP range: Netmask is required
- DNS, NTP, Syslog, and Gateway IP are recommended
Uplink:
We require the customer to be able to configure two links in an LACP port channel, trunking all needed VLANs (especially HyperCloud Management and OOB). Ideally these links would terminate on two separate switches on the customer side for redundancy.
Uplink Physical Information:
Port Type: for example, single-mode fiber, multi-mode fiber, or Direct Attach Copper (DAC)
Port Speed: 1, 10, 25, 40, or 100 Gb/s
Forward Error Correction (FEC) protocol, if applicable (usually only on 40 / 100 Gb/s)
We recommend fiber uplinks to the customer fabric to avoid SFP compatibility issues, and we provide our side of the connection. The customer supplies their own side of the connection.
Info
Per set of interconnects, one of the following choices is required for the fiber uplinks, in order of preference / performance:
- Two - 100 Gb/s Multi-mode QSFP28 Modules
- Two - 100 Gb/s Single-mode QSFP28 Modules
- Two - 40 Gb/s Multi-mode QSFP+ Modules
- Two - 40 Gb/s Single-mode QSFP+ Modules
- Two - 25 Gb/s Multi-mode SFP28 Modules
- Two - 25 Gb/s Single-mode SFP28 Modules
- Two - 10 Gb/s Multi-mode SFP+ Modules
- Two - 10 Gb/s Single-mode SFP+ Modules
- Two - 10 Gb/s Copper SFP+ Modules
- Two - 1 Gb/s Multi-mode SFP Modules
- Two - 1 Gb/s Single-mode SFP Modules
- Two - 1 Gb/s Copper SFP Modules
Installation
Step 1 - Racking
Suggested layout scheme (from top, working downward)
Note
Customer may have an opinion.
- Management switch at top
- Interconnects
- Compute
- Three Static Nodes
- Storage
Ensure that only the three (3) static nodes have M.2 drives installed and that those drives have been flashed with the latest HyperCloud disk image; see Flashing M.2 SSDs (a generic flashing sketch follows the example below).
- Double-check that all other nodes' boot drives are removed
- Static nodes are the three "fastest" storage nodes in the cluster, and typically have the most RAM.
Example
If a cluster is made up of 8 - HD11120s and 3 - HD21216s, the 3 - HD21216s should be selected as the "static" nodes. A reasonable rule of thumb is that the third and fourth digits represent the node class, with the higher number representing the "faster" node.
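If the linked flashing procedure is not to hand, the general approach is to write the HyperCloud disk image to each static node's M.2 drive through a USB adapter from a laptop or other Linux machine. The image filename and target device below are placeholders (assumptions, not the official procedure), so verify the device carefully before writing.

```
# Identify the M.2 drive attached via the USB adapter (placeholder device /dev/sdX)
lsblk -d -o NAME,SIZE,MODEL

# Write the HyperCloud image (placeholder filename) to the M.2 drive
dd if=hypercloud-release.img of=/dev/sdX bs=4M status=progress conv=fsync
```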
Step 2 - Cabling
The last job, before verifying the setup's success, is to connect the uplinks to the customer infrastructure.
Port Tables
High Speed Interconnect (7326) port mapping
Physical Port Number | Logical Port Number (in SONiC) | Breakout Membership |
---|---|---|
1 | Ethernet0 | Ethernet5 |
2 | Ethernet1 | Ethernet5 |
3 | Ethernet2 | Ethernet5 |
4 | Ethernet3 | Ethernet6 |
5 | Ethernet4 | Ethernet6 |
6 | Ethernet5 | Ethernet5 |
7 | Ethernet6 | Ethernet6 |
8 | Ethernet7 | Ethernet10 |
9 | Ethernet8 | Ethernet6 |
10 | Ethernet9 | Ethernet10 |
11 | Ethernet10 | Ethernet10 |
12 | Ethernet11 | Ethernet10 |
13 | Ethernet12 | Ethernet17 |
14 | Ethernet13 | Ethernet17 |
15 | Ethernet14 | Ethernet17 |
16 | Ethernet15 | Ethernet18 |
17 | Ethernet16 | Ethernet18 |
18 | Ethernet17 | Ethernet17 |
19 | Ethernet18 | Ethernet18 |
20 | Ethernet19 | Ethernet20 |
21 | Ethernet20 | Ethernet18 |
22 | Ethernet21 | Ethernet20 |
23 | Ethernet22 | Ethernet20 |
24 | Ethernet23 | Ethernet20 |
25 | Ethernet24 | Ethernet29 |
26 | Ethernet25 | Ethernet29 |
27 | Ethernet26 | Ethernet29 |
28 | Ethernet27 | Ethernet30 |
29 | Ethernet28 | Ethernet30 |
30 | Ethernet29 | Ethernet29 |
31 | Ethernet30 | Ethernet30 |
32 | Ethernet31 | Ethernet34 |
33 | Ethernet32 | Ethernet30 |
34 | Ethernet33 | Ethernet34 |
35 | Ethernet34 | Ethernet34 |
36 | Ethernet35 | Ethernet34 |
37 | Ethernet36 | Ethernet41 |
38 | Ethernet37 | Ethernet41 |
39 | Ethernet38 | Ethernet41 |
40 | Ethernet39 | Ethernet39 |
41 | Ethernet40 | Ethernet39 |
42 | Ethernet41 | Ethernet41 |
43 | Ethernet42 | Ethernet39 |
44 | Ethernet43 | Ethernet43 |
45 | Ethernet44 | Ethernet39 |
46 | Ethernet45 | Ethernet43 |
47 | Ethernet46 | Ethernet43 |
48 | Ethernet47 | Ethernet43 |
49 | Ethernet48 | Ethernet48 |
50 | Ethernet52 | Ethernet52 |
51 | Ethernet56 | Ethernet56 |
52 | Ethernet60 | Ethernet60 |
53 | Ethernet64 | Ethernet64 |
54 | Ethernet68 | Ethernet68 |
55 | Ethernet72 | Ethernet72 |
56 | Ethernet76 | Ethernet76 |
Management Interconnect (4630) port mapping

Physical Port Number | Logical Port Number (in SONiC) | Breakout Membership |
---|---|---|
1 | Ethernet0 | Ethernet0 |
2 | Ethernet1 | Ethernet1 |
3 | Ethernet2 | Ethernet2 |
4 | Ethernet3 | Ethernet3 |
5 | Ethernet4 | Ethernet4 |
6 | Ethernet5 | Ethernet5 |
7 | Ethernet6 | Ethernet6 |
8 | Ethernet7 | Ethernet7 |
9 | Ethernet8 | Ethernet8 |
10 | Ethernet9 | Ethernet9 |
11 | Ethernet10 | Ethernet10 |
12 | Ethernet11 | Ethernet11 |
13 | Ethernet12 | Ethernet12 |
14 | Ethernet13 | Ethernet13 |
15 | Ethernet14 | Ethernet14 |
16 | Ethernet15 | Ethernet15 |
17 | Ethernet16 | Ethernet16 |
18 | Ethernet17 | Ethernet17 |
19 | Ethernet18 | Ethernet18 |
20 | Ethernet19 | Ethernet19 |
21 | Ethernet20 | Ethernet20 |
22 | Ethernet21 | Ethernet21 |
23 | Ethernet22 | Ethernet22 |
24 | Ethernet23 | Ethernet23 |
25 | Ethernet24 | Ethernet24 |
26 | Ethernet25 | Ethernet25 |
27 | Ethernet26 | Ethernet26 |
28 | Ethernet27 | Ethernet27 |
29 | Ethernet28 | Ethernet28 |
30 | Ethernet29 | Ethernet29 |
31 | Ethernet30 | Ethernet30 |
32 | Ethernet31 | Ethernet31 |
33 | Ethernet32 | Ethernet32 |
34 | Ethernet33 | Ethernet33 |
35 | Ethernet34 | Ethernet34 |
36 | Ethernet35 | Ethernet35 |
37 | Ethernet36 | Ethernet36 |
38 | Ethernet37 | Ethernet37 |
39 | Ethernet38 | Ethernet38 |
40 | Ethernet39 | Ethernet39 |
41 | Ethernet40 | Ethernet40 |
42 | Ethernet41 | Ethernet41 |
43 | Ethernet42 | Ethernet42 |
44 | Ethernet43 | Ethernet43 |
45 | Ethernet44 | Ethernet44 |
46 | Ethernet45 | Ethernet45 |
47 | Ethernet46 | Ethernet46 |
48 | Ethernet47 | Ethernet47 |
49 | (25Gb) | Ethernet48 |
50 | (25Gb) | Ethernet49 |
51 | (25Gb) | Ethernet50 |
52 | (25Gb) | Ethernet51 |
53 | (100Gb) | Ethernet52 |
54 | (100Gb) | Ethernet56 |
Procedure
- Bundling the cables into groups of four will make it easier to manage later (use the velcro ties from the DACs, etc.)
- Power Supply Units (PSUs) can be connected at any time and status lights can be verified when connections are made.
Note
For the ports:
- The default example assumes up to 18 compute nodes and 18 storage nodes. Depending on the cluster topology deployed, port selection may need to be modified to account for node counts.
- Switch ports 1 - 18 (0 - 17 logical port mapping) are reserved for the storage nodes.
- Ports 19 - 36 (18 - 35 logical port mapping) are reserved for the compute nodes.
- Port 46 on each High Speed switch is the default uplink to the Management switch; plug these into ports 50 and 51 of the Management switch.
- Ports 72 and 76 are the 100 Gb/s cross-links between the High Speed switches, as defined in the software; however, the physical ports are 55 and 56. Plug Port 55 on High Speed switch 1 into Port 55 on High Speed switch 2. Do the same for Port 56.
Info
The SONiC switch mapping starts the port count at 0, not 1 as on the physical label, and it counts each of the 100 Gb ports as 4, which is why the port numbers are 72 and 76, even though they are physically adjacent.
- Finally, the uplink(s) to the customer infrastructure will use either the last 25 Gb port (48) or the first 100 Gb port (49).
- The interconnect management ports can be routed to ports 46, 47, and 48 on the Management interconnect. (This includes the management port on the Management interconnect itself.)
Step 3 - Configure Interconnects
Note
The interconnects shipped run SONiC today. This will change in a future release of the HyperCloud product. The interconnects are NOT intended to be used as standard switches.
Info
If any issues occur during the configuration of the interconnects, follow the link below to reset the interconnects to their default state and restart the configuration procedure:
[Enterprise SONiC] Reset default configuration
To check for errors, run `echo $?` after running the configuration scripts. In the event of a non-zero exit status, follow the reset procedure linked above to wipe the interconnect and restart the configuration.
- Connect to the Management Interconnect via either the USB A or Serial port on the left side of the device, at a serial baud rate of 115200. The default credentials are: `admin:YourPaSsWoRd`.
- Verify the interconnect is running the latest approved SONiC firmware (verify the switch boots into the SONiC console, then log in and run `show version`).
  - If the interconnect is not running SONiC (boots to ONIE), OR if the interconnect is running an alternative release of SONiC, install the firmware image on the interconnect.
- Leave default passwords in place (at least until handover)
- Confirm gathered customer information
- Untar the SI SW Interconnect Configurations Tarball locally and generate the initial configurations as detailed in the `README.md` file, found under the `docs/` subdirectory in the tarball.

Tarball contents

```
./
./opt/
./opt/softiron/
./opt/softiron/switch-config/
./opt/softiron/switch-config/docs/
./opt/softiron/switch-config/docs/README.md
./opt/softiron/switch-config/EXAMPLE/
./opt/softiron/switch-config/EXAMPLE/cluster.vars
./opt/softiron/switch-config/EXAMPLE/mgmt-switch-oob.vars
./opt/softiron/switch-config/EXAMPLE/high-speed-switch.vars
./opt/softiron/switch-config/edgecore/
./opt/softiron/switch-config/edgecore/4630/
./opt/softiron/switch-config/edgecore/4630/bin/
./opt/softiron/switch-config/edgecore/4630/bin/oob.sh
./opt/softiron/switch-config/edgecore/7326/
./opt/softiron/switch-config/edgecore/7326/bin/
./opt/softiron/switch-config/edgecore/7326/bin/high-speed-switch.sh
```
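For reference, the extraction step might look like the following; the tarball filename is a placeholder for whatever SoftIron supplies:

```
# Extract the configuration scripts into the current directory (placeholder filename)
tar -xvf softiron-switch-config.tar.gz
cd opt/softiron/switch-config
```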
- Create the variable files
- There are templates in the tarball that can be edited for each use case
- Ensure that the variable files are in the same location as the OOB and High Speed Interconnect scripts
Cluster Configuration Variables:
Note
Customer VLANs are listed as a space-separated list. If there is only one VLAN ID, make it the OOB VLAN ID.
High Speed Configuration Variables:
Management Configuration Variables:
Script compression and encoding
The steps below allow larger interconnect configurations to be compressed and sent over serial at a baud rate of 115200. Follow them for both the management and high speed interconnect configuration scripts.
- Generate the `up.sh` script locally before copying it over to the Management Interconnect.
- Execute the OOB script to generate the commands that will be used to configure the management interconnect.
  - The command below will create and store the commands in a new file that can then be copied to the management interconnect (a hedged sketch follows).
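The exact invocation is not reproduced here; by analogy with the high-speed script usage later in this procedure and the `oob.sh` path shown in the tarball listing, it likely resembles the following (treat the path and arguments as assumptions):

```
# Generate the management (OOB) interconnect configuration commands into up.sh
./edgecore/4630/bin/oob.sh up > up.sh
cat up.sh
```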
- Generate the SHA-256 hash of the script: `sha256sum up.sh`
- Compress and BASE64 encode the `up.sh` script (see the sketch after this list)
- Copy the output
- Create a new `up.b64` file on the interconnect and paste the string from above
- Decode and uncompress the `up.b64` file
- Generate the SHA-256 hash of the `up.sh` script on the interconnect: `sha256sum up.sh`
- Verify that the checksums from both the source and destination match EXACTLY
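A minimal sketch of the compress/encode round trip, assuming `gzip` and `base64` are available on both the laptop and the interconnect (the original commands are not reproduced here, so treat the exact tools and flags as assumptions):

```
# On the laptop: compress and BASE64 encode up.sh, then copy the printed text
gzip -c up.sh | base64 > up.b64
cat up.b64

# On the interconnect: paste the text into up.b64, then decode and uncompress
base64 -d up.b64 | gunzip > up.sh

# Verify the checksum matches the one generated on the laptop
sha256sum up.sh
```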
If not moving the file from above to the interconnect, the contents will need to be copied into a file created on the interconnect; start by printing the contents to the screen with `cat`:
- On the management interconnect, create a new file, paste the contents from the cat command above, save it, and make it executable.

To accomplish the steps above, create the script file with an editor (for example, `vi up.sh`):

PASTE (`ctrl-v` or `⌘-v`) the copied text, then press Escape, then type `:wq`, and finally press Enter or Return.
- The script will be made executable via: `chmod 755 up.sh`
  - If the console will not execute the commands, the privileges may need to be elevated with `sudo`; in that case, execute the commands as `sudo chmod 755 up.sh`, `sudo ./up.sh`, etc.
- Example of the contents that will be pasted into the file:
```
sudo sh -c 'cat /etc/rc.local | head -n -1 > /etc/rc.local.new'
echo "/usr/sbin/ifconfig eth0 hw ether \$(/usr/sbin/ifconfig eth0 | grep ether | awk '{print \$2}' | awk -F: '{if (\$5==\"ff\") {\$5=\"00\"} else {\$5=sprintf(\"%02x\",(\"0x\"\$5)+1)} ; print \$1\":\"\$2\":\"\$3\":\"\$4\":\"\$5\":\"\$6\"\"}')" > /tmp/hwaddr-tmp-replace
sudo sh -c 'cat /tmp/hwaddr-tmp-replace >> /etc/rc.local.new'
sudo sh -c 'echo "exit 0" >> /etc/rc.local.new'
sudo mv /etc/rc.local.new /etc/rc.local
sudo chmod 755 /etc/rc.local
rm -f /tmp/hwaddr-tmp-replace
sudo config vlan add 1
sudo config interface ip add eth0 10.1.2.23/24 10.1.2.1
sudo config portchannel add PortChannel0 --fallback=true --lacp-key=1
sudo config portchannel member add PortChannel0 Ethernet50
sudo config portchannel member add PortChannel0 Ethernet51
sudo config vlan member add -u 1 PortChannel0
sudo config vlan member add -u 1 Ethernet0
sudo config vlan member add -u 1 Ethernet1
.
.
.
sudo config vlan member add -u 1 Ethernet45
sudo config vlan member add -u 1 Ethernet46
sudo config vlan member add -u 1 Ethernet47
sudo config save -y
```
- This newly created script can now be executed (`./up.sh`), after which the management interconnect will be configured.
  - The console may report name resolution failures when configuring the interconnects; this can be ignored, as there is no DNS in place to resolve these targets.
  - It can be verified that the script ran with no errors with the command `echo $?` and a result of `0`
  - Follow up with a sync and reboot of the interconnect: `sync && sync && sync && sudo reboot`
-
The High Speed Interconnect configuration is next.
- Ensure that the `high-speed-switch.sh`, `high-speed-switch.vars`, and the `cluster.vars` files are all in the same directory on the local machine.
- One interconnect will be the primary and the other will be the secondary.
- Ensure that the
- Execute the script for the primary High Speed Interconnect to generate the commands that will configure the interconnect:
  - Again, the command below will create and store the commands in a new file that will then be copied onto the Primary High Speed Interconnect:

    ```
    ./high-speed-switch.sh up primary > primary-hss.sh
    cat primary-hss.sh
    ```

  - Or, on the Primary High Speed Interconnect, create a new file, paste the contents from the `cat` command above, save it, and make it executable:

    ```
    vi primary-hss.sh
    ```

    PASTE the copied commands, then press Escape, then type `:wq`, followed by Enter.

    ```
    chmod 755 primary-hss.sh
    ```
- Execute this new script on the interconnect, followed by a sync and reboot:

    ```
    ./primary-hss.sh
    sync && sync && sync && sudo reboot
    ```
- Now the secondary High Speed Interconnect will be configured, using the same process as the primary High Speed Interconnect:
  - Again, the command below will create and store the commands in a new file that will then be copied onto the Secondary High Speed Interconnect:

    ```
    ./high-speed-switch.sh up secondary > secondary-hss.sh
    cat secondary-hss.sh
    ```

  - On the Secondary High Speed Interconnect, create a new file, paste the contents from the `cat` command above, save it, and make it executable:

    ```
    vi secondary-hss.sh
    ```

    PASTE the copied commands, then press Escape, then type `:wq`, followed by Enter.

    ```
    chmod 755 secondary-hss.sh
    ```

- Execute this new script on the interconnect, followed by a sync and reboot:

    ```
    ./secondary-hss.sh
    sync && sync && sync && sudo reboot
    ```
- Set customer uplink speed
  - Only required if the customer uplink uses SFP28 ports and does not support 25 Gb/s, OR the customer uplink uses QSFP28 ports and does not support 100 Gb/s.
  - If using SFP28 ports and the customer uplink does not support 25 Gb/s (i.e. 10 Gb/s only):

    ```
    sudo config interface breakout Ethernet43 '4x10G[1G]'
    ```

    This changes the port speed to 10 Gb/s on all child interfaces:
    - Child ports: `Ethernet43`, `Ethernet45`, `Ethernet46`, `Ethernet47`
  - If using QSFP28 ports and the customer uplink does not support 100 Gb/s (i.e. 40 Gb/s only):

    ```
    sudo config interface speed Ethernet48 40000
    ```
- Save the configuration: `sudo config save -y`
- Reboot and ensure everything has been configured properly with no errors.
- Ensure everything is healthy so far (i.e. on SONiC):
  - `show mclag brief`
  - `show vlan config` (or `show vlan brief`)
- Back up the configurations to be stored in Git later: `scp admin@[interconnect]:/etc/sonic/config_db.json [somewhere_local]`
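For example, a small loop can collect all three configurations in one pass; the management IPs below are placeholders for the OOB addresses assigned during discovery:

```
# Placeholder interconnect management IPs; substitute the real OOB addresses
for ic in 10.1.2.21 10.1.2.22 10.1.2.23; do
    scp "admin@${ic}:/etc/sonic/config_db.json" "./config_db-${ic}.json"
done
```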
Step 4 - Install the Storage Cluster
Warning
To allow faster cluster convergence on an initial build, it is recommended to set the hardware clocks of all nodes from the UEFI to times and dates as close together as possible. If the clocks are far in the past, such as when the CMOS battery has failed, then cluster convergence may take several hours.
- Power up the first static node with the serial console cable attached, configured to a baud rate of 115200.
- Choose to install static node 1
- If this is the first time booting static node 1, the node will check the boot disk to determine whether the system time has been set. If the system time has never been set, the user will be prompted to respond to the query below:
    ```
    The current system time is: 16:24:33 2023/10/25
    Please choose one of the following:
    1: Change the system time
    2: Keep the current system time
    > 1
    Enter a new system time with the format: hour:minute:seconds year/month/day
    For example: 23:08:41 2023/09/21
    The system time must be UTC
    > 16:25:30 2023/10/25
    New system time is : 16:25:30 2023/10/25
    Would you like to apply the new system time?
    1: Apply the new system time
    2: Change the new system time
    > 1
    ```
- Once the user either confirms the system time or manually inputs the time in the specified format, the system will record this time and no longer prompt for this information in future deployments.
- Wait for it to run through the remaining setup and checks.
- Run `lsb_release -a` to verify release information.
- Continue with the second static node
  - Choose to install static node 2
- Continue with the third static node
  - Choose to install static node 3
- Watch for the OSDs to be created
Note
Depending on how much time you have, you can now power on the rest of the storage nodes - make sure they have PXE boot configured correctly.
Info
If the node was part of a previous cluster build, HyperDrive or HyperCloud, then HyperCloud may refuse to wipe an existing data disk and ingest it into the cluster. If you are unable to boot another OS to wipe the data disks, then you can create a cluster control fact to nuke the disks by running the following command: `touch /var/run/cluster-control/facts/clusternode_NODENAME_nuke_disks`, replacing `NODENAME` with the HyperCloud hostname of the machine in question.
- Run `ceph -s` to verify
- Once all the storage nodes are up, run `hypercloud-manage` on any node and enter the following information:

    ```
    Compute (KVM) VLAN ID:
    Storage VLAN ID:
    Dashboard system VLAN ID (optional):
    Dashboard system IP Address (optional):
    Dashboard system netmask:
    Dashboard system default gateway (optional):
    Dashboard system DNS servers (optional):
    Dashboard system Syslog servers (optional):
    Dashboard system NTP servers (optional):
    ```
- Choose the remaining options in sequential order (2, 3, & 4) to show the proposed cluster changes (input values), commit the changes, and quit the HyperCloud Cluster Manager, respectively.
Info
If modifications are needed to the cluster variables, option 1 can be chosen again to re-enter the information; alternatively, choosing option 4 before committing the changes with option 3 will discard them.
Step 5 - Install the Compute Cluster
- Power up the nodes one by one, ensuring they PXE boot
Info
There is no need to wait for them to boot fully.
- Check that the HyperCloud dashboard comes up.
Info
- On the first compute node you should see the dashboard with `virsh list`.
- You can connect to the command line of the dashboard with `hypercloud-dashboard-console` from that hostname.
- You now have a running HyperCloud cluster
- You can pull the admin:password login from this dashboard node as well via: `cat /var/lib/one/.one/one_auth`
- The HyperCloud Dashboard can be configured via the Cloud Management GUI at `https://<dashboard_ip>/cloudmanagement/`, including adding custom SSH keys for authentication.
Step 6 - Connect to the Customer Infrastructure
- Plug in the customer uplinks
- Troubleshoot
Warning
This is typically the hardest part of the install, as it requires interfacing with the customer fabric. Up until this point, the entire HyperCloud installation is self-contained.
If the customer side has not set up LACP properly, you may have to disconnect the uplink to the secondary switch to allow traffic to pass, as the primary switch allows for LACP bypass.
- At this point, it would be VERY helpful to have a dump of the port configurations from the customer-side switch fabric.
- Do they have LACP enabled on their side?
- Is LACP on their side configured at “Rate: Slow”?
- Are all VLANs trunked that are needed from the customer side?
- Does the physical link layer come up? (i.e. Do you have a link light?)
- If not, double check interface speeds and/or FEC settings on both sides. On the Edge-Core side, see: SONiC Port Attributes
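A few standard SONiC CLI checks are generally useful at this point (run on the High Speed interconnects); these are stock SONiC show commands rather than a SoftIron-specific procedure:

```
show interfaces status
show interfaces portchannel
show mclag brief
show lldp table
```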
Step 7 - Verify Customer can reach HyperCloud Cluster
Log in and help with the first steps of working with an empty cluster.
See HyperCloud Documentation - User Guide
Notes
- To quickly set the BMC addresses (e.g.)
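The original example command is not reproduced here. One common approach (an assumption, not necessarily the SoftIron tooling) is to set the BMC LAN configuration from the host OS with `ipmitool`:

```
# Set a static BMC address on LAN channel 1 (placeholder addressing)
ipmitool lan set 1 ipsrc static
ipmitool lan set 1 ipaddr 10.1.2.101
ipmitool lan set 1 netmask 255.255.255.0
ipmitool lan set 1 defgw ipaddr 10.1.2.1
ipmitool lan print 1
```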
- To add extra VLANs on the SONiC switches (then set up the virtual net in the dashboard): `sudo config vlan add [VLAN_ID]`
  - To add it tagged: `sudo config vlan member add [VLAN_ID] PortChannelXX`
  - XX meaning: repeat for every port channel to every compute node and the uplink to the customer fabric
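As a usage illustration, adding a tagged VLAN across several port channels can be scripted; the VLAN ID and port channel numbers below are placeholders:

```
# Add VLAN 100 (placeholder) and tag it on each relevant port channel (placeholder range)
sudo config vlan add 100
for pc in 1 2 3 4; do
    sudo config vlan member add 100 "PortChannel${pc}"
done
sudo config save -y
```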