SLES VM not booting and login prompt not appearing

Category: azure vm

Question

Palchak on Sat, 02 Dec 2017 04:27:10


I have a SLES Resource Manager VM (size DS15_v2) running SAP, and I am not getting the login screen.

I have LVM configured with 5 external data disks. Initially the problem seemed to be caused by an /etc/fstab entry, and Boot Diagnostics showed:

[FAILED] Failed to start File System Check on /dev/vgSAP/lvhostctrl.
See 'systemctl status systemd-fsck@dev-vgSAP-lvhostctrl.service' for details.

So I stopped and deallocated the VM, detached the data disks, and then deleted the VM.

Then I attached the OS VHD to another running SLES VM and mounted it there. Next, I commented out all the entries in the /etc/fstab file and unmounted the OS VHD from that VM.

Next, I used the OS VHD to create a new VM, but the VM still does not reach the login prompt, even after rebooting 2-3 times and redeploying the VM to another host. I have even tried resetting the SSH configuration from the portal, but that didn't help.

The error now shown in Boot Diagnostics is:

[  *      ] A start job is running for LSB: Sup...ry formats. (4min 39s / 5min 4s)
...
[      * ] A start job is running for LSB: Sup...ry formats. (5min 4s / 10min 4s)
[FAILED] Failed to start LSB: Supports the direct execution of binary formats..
See 'systemctl status jexec.service' for details.

Can anyone tell me why this is happening and how to fix it, so that I can get the login prompt in PuTTY over SSH?


Pallab Chakraborty

Replies

Nirushi J on Sat, 02 Dec 2017 10:39:53


This problem may occur if the syntax of the file systems table (fstab) is incorrect, or if a required data disk that is mapped to an entry in the "/etc/fstab" file is not attached to the VM. Please refer to the links below and let us know:

https://support.microsoft.com/en-in/help/3206699/azure-linux-vm-cannot-start-because-of-fstab-errors

https://support.microsoft.com/en-in/help/3213321/linux-recovery-cannot-ssh-to-linux-vm-due-to-file-system-errors-fsck

------------------------------------------------------------------------------------------

Do click on "Mark as Answer" on the post that helps you, this can be beneficial to other community members.

Palchak on Sat, 02 Dec 2017 14:12:12


I performed the steps mentioned above: I added nofail to the /etc/fstab entries, commented out all the LVM entries, and then used the OS VHD to create a new VM again.
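The nofail edit described above can be sketched in Python for anyone who prefers to script it against the mounted OS disk. This is only an illustration, not the procedure from the linked articles: the function name `add_nofail`, the mount point `/usr/sap/hostctrl` and the `xfs` file system type are assumptions made up for the example; only the device name /dev/vgSAP/lvhostctrl comes from the boot log.

```python
def add_nofail(fstab_line, protected=("/", "/boot", "swap")):
    """Return fstab_line with 'nofail' appended to its mount options.

    Comments, blank lines, malformed lines and entries whose mount
    point is listed in `protected` are returned unchanged.
    """
    stripped = fstab_line.strip()
    if not stripped or stripped.startswith("#"):
        return fstab_line
    fields = stripped.split()
    # fstab fields: device, mount point, type, options, dump, pass
    if len(fields) < 4 or fields[1] in protected:
        return fstab_line
    options = fields[3].split(",")
    if "nofail" not in options:
        options.append("nofail")
    fields[3] = ",".join(options)
    return " ".join(fields)


# Hypothetical entry: mount point and file system type are assumed.
print(add_nofail("/dev/vgSAP/lvhostctrl /usr/sap/hostctrl xfs defaults 0 2"))
# /dev/vgSAP/lvhostctrl /usr/sap/hostctrl xfs defaults,nofail 0 2
```

With nofail set, a data disk that is missing at boot should no longer block the boot sequence. Note that the option name is lower-case.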

Now Boot Diagnostics shows that the VM is sitting at the login prompt, but I still cannot SSH to it using PuTTY. I also tried Telnet to the private IP from another VM in the same subnet, with the same result.
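The Telnet test described above can also be done with a few lines of Python run from another VM in the same subnet. This is a minimal sketch: the function name `tcp_port_open` is made up for the example, and port 22 is assumed to be the SSH port.

```python
import socket


def tcp_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `tcp_port_open("10.0.0.4")` (a hypothetical private IP) returns False if nothing answers on port 22. A False result alone does not tell you whether the cause is a network rule, a firewall, or sshd not running on the VM.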

Kindly tell me why SSH is still not working from PuTTY and why I cannot connect remotely to the VM.

Please see the logs below for reference:

[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 4.4.49-92.11-default (geeko@buildhost) (gcc version 4.8.5 (SUSE Linux) ) #1 SMP Fri Feb 17 08:29:30 UTC 2017 (8f9478a)
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.4.49-92.11-default root=UUID=49f667de-606a-4200-b5be-1c54a932f223 root=/dev/disk/by-uuid/49f667de-606a-4200-b5be-1c54a932f223 disk=/dev/sda resume=swap USE_BY_UUID_DEVICE_NAMES=1 earlyprintk=ttyS0 console=ttyS0 rootdelay=300 net.ifnames=0 quiet
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers'
[    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[    0.000000] x86/fpu: Using 'eager' FPU context switches.
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001ffeffff] usable
[    0.000000] BIOS-e820: [mem 0x000000001fff0000-0x000000001fffefff] ACPI data
[    0.000000] BIOS-e820: [mem 0x000000001ffff000-0x000000001fffffff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x0000000fdfffffff] usable
[    0.000000] BIOS-e820: [mem 0x0000001000000000-0x000000127fffffff] usable
[    0.000000] BIOS-e820: [mem 0x0000001280200000-0x00000024001fffff] usable
[    0.000000] bootconsole [earlyser0] enabled
[FAILED] Failed to start Entropy Daemon based on the HAVEGE algorithm.
See 'systemctl status haveged.service' for details.
[  OK  ] Started Create Volatile Files and Directories.
         Starting Update UTMP about System Boot/Shutdown...
[  OK  ] Started LSB: AppArmor initialization.
[  OK  ] Started Update UTMP about System Boot/Shutdown.
[  OK  ] Started udev Kernel Device Manager.
[  OK  ] Found device /dev/ttyS0.
[FAILED] Failed to start Entropy Daemon based on the HAVEGE algorithm.
See 'systemctl status haveged.service' for details.
[   13.704254] blk_update_request: I/O error, dev fd0, sector 0
[   13.711642] piix4_smbus 0000:00:07.3: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr
[FAILED] Failed to start Entropy Daemon based on the HAVEGE algorithm.
See 'systemctl status haveged.service' for details.
[   13.996162] intel_rapl: no valid rapl domains found in package 0
[  OK  ] Found device Virtual_Disk BOOT.
[   14.041551] intel_rapl: no valid rapl domains found in package 0
         Starting File System Check on /dev/...7-b7e7-4c5d-8631-036e13a411d2...
[   14.122864] intel_rapl: no valid rapl domains found in package 0
[   14.170476] intel_rapl: no valid rapl domains found in package 0
[   14.219126] intel_rapl: no valid rapl domains found in package 0
[   14.261229] intel_rapl: no valid rapl domains found in package 0
[  OK  ] Started File System Check on /dev/d...5b7-b7e7-4c5d-8631-036e13a411d2.
         Mounting /boot...
[  OK  ] Mounted /boot.
         Starting Restore /run/initramfs on shutdown...
         Starting Apply Kernel Variables...
[   14.623310] ip_local_port_range: prefer different parity for start/end values.
[  OK  ] Started Restore /run/initramfs on shutdown.
[  OK  ] Started Apply Kernel Variables.
[  OK  ] Reached target System Initialization.
[  OK  ] Started Discard unused blocks once a week.
[  OK  ] Listening on Open-iSCSI iscsid Socket.
[  OK  ] Listening on UUID daemon activation socket.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
[  OK  ] Started D-Bus System Message Bus.
         Starting Name Service Cache Daemon...
         Starting Hyper-V VSS Daemon...
         Starting wicked AutoIPv4 supplicant service...
         Starting LSB: Supports the direct execution of binary formats....
[  OK  ] Started irqbalance daemon.
         Starting /etc/init.d/boot.local Compatibility...
         Starting System Logging Service...
         Starting wicked DHCPv4 supplicant service...
         Starting wicked DHCPv6 supplicant service...
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
[  OK  ] Started Name Service Cache Daemon.
[  OK  ] Started Hyper-V VSS Daemon.
[  OK  ] Started /etc/init.d/boot.local Compatibility.
[  OK  ] Started wicked AutoIPv4 supplicant service.
[  OK  ] Started wicked DHCPv4 supplicant service.
[  OK  ] Started wicked DHCPv6 supplicant service.
         Starting wicked network management service daemon...
         Mounting Arbitrary Executable File Formats File System...
[  OK  ] Reached target User and Group Name Lookups.
         Starting Permit User Sessions...
         Starting Login Service...
[  OK  ] Reached target Host and Network Name Lookups.
[  OK  ] Mounted Arbitrary Executable File Formats File System.
[  OK  ] Started Permit User Sessions.
[  OK  ] Started LSB: Supports the direct execution of binary formats..
[  OK  ] Started wicked network management service daemon.
[  OK  ] Started Login Service.
         Starting wicked network nanny service...
[  OK  ] Started Serial Getty on ttyS0.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started /etc/init.d/after.local Compatibility.
[  OK  ] Started wicked network nanny service.
         Starting wicked managed network interfaces...
[  OK  ] Started System Logging Service.


Welcome to SUSE Linux Enterprise Server 12 SP2  (x86_64) - Kernel 4.4.49-92.11-default (ttyS0).


mel-db3-demo-hana login: 

Welcome to SUSE Linux Enterprise Server 12 SP2  (x86_64) - Kernel 4.4.49-92.11-default (ttyS0).


mel-db3-demo-hana login: 

Md Shihab on Mon, 04 Dec 2017 04:44:51


This issue requires deeper technical troubleshooting, so I suggest you create an Azure support request as soon as possible for specialized assistance.


Palchak on Thu, 07 Dec 2017 03:45:57


I raised a ticket with Microsoft Technical Support, and Linux support was involved. After troubleshooting by attaching the OS VHD to another working Linux machine and then creating a new VM from the same OS VHD, the machine was coming up, but I still wasn't able to SSH to the server.

The command history in .bash_history showed that an SAP user had run the command "chmod 666 /", which broke the permissions of the root directory.

The Linux support engineer said this was the reason the network scripts weren't loading: the permissions on the root file system were corrupted.
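To illustrate why mode 666 on a directory is so disruptive, the short Python sketch below decodes the permission bits. It is not part of the support engineer's diagnosis, and the helper name `describe_dir_mode` is made up for the example. On a directory, the execute bit is the search permission, so without it unprivileged processes cannot traverse the directory to reach anything beneath it.

```python
import stat


def describe_dir_mode(mode):
    """Return (ls-style mode string, True if any execute/search bit is set)."""
    searchable = bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
    return stat.filemode(stat.S_IFDIR | mode), searchable


print(describe_dir_mode(0o666))  # ('drw-rw-rw-', False)
print(describe_dir_mode(0o755))  # ('drwxr-xr-x', True)
```

Mode 755 is shown for comparison as the usual mode of the root directory on most Linux distributions. Processes running as root normally bypass these checks, which would be consistent with the VM reaching the login prompt while other services failed.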

So I asked him what the best way is to avoid giving root access to a normal user, and he said to just create normal user accounts and not share the root credentials with them.

But he also said that if a normal user wants to run a command that needs elevated permissions, he/she can just run sudo su and provide his/her own login password, and the command will execute with root privileges.

He said this behavior is specific to Azure Linux VMs: in a typical on-premises setup, running sudo su with the normal user's password won't allow a user to run chmod 666 /, so there is no chance of corrupting the root file system.

Is what the tech support engineer told me correct? If so, it doesn't matter whether I give root credentials to a normal user or not, since he/she can do anything by running sudo su and entering his/her own password.

Nirushi J on Thu, 07 Dec 2017 12:37:28


Could you share the support ticket number with us?



Palchak on Fri, 08 Dec 2017 04:06:09


The ticket number is 117120117250939, and Abdul Muqeet Mohd was the tech support engineer.

He only gave me the information about the difference in root credentials and access permissions between an on-premises Linux instance and the cloud, which I mentioned in my last post.

Could you please confirm whether that is correct, and tell me how I can restrict a normal user on SLES or RHEL in Azure so that they cannot run such chmod commands via sudo su with their own credentials but can still do other work?

Thanks

Nirushi J on Tue, 12 Dec 2017 17:12:54


Yes, you can restrict the user in Azure using the sudoers file. Please refer to the link below:

https://www.techrepublic.com/article/limiting-root-access-with-sudo-part-1/  

Additionally, for the Linux images, if you use the Azure portal, ‘azureuser’ is given as the default user name, but you can change this by using ‘From Gallery’ instead of ‘Quick Create’ to create the virtual machine. Using ‘From Gallery’ also lets you decide whether to use a password, an SSH key, or both to log in. The user account is a non-privileged user that has ‘sudo’ access to run privileged commands. The ‘root’ account is disabled.

Refer to the link below:

https://docs.microsoft.com/en-us/azure/virtual-machines/linux/classic/faq  

If you wish, you can always re-open the support ticket (117120117250939) for further queries and information on this issue.

Disclaimer: This response contains a reference to a third party World Wide Web site. Microsoft is providing this information as a convenience to you. Microsoft does not control these sites and has not tested any software or information found on these sites; therefore, Microsoft cannot make any representations regarding the quality, safety, or suitability of any software or information found there. There are inherent dangers in the use of any software found on the Internet, and Microsoft cautions you to make sure that you completely understand the risk before retrieving any software from the Internet.
